Similar Documents
20 similar documents found (search time: 15 ms)
1.
Bloom filters provide space-efficient storage of sets at the cost of a probability of false positives on membership queries. The size of the filter must be defined a priori, based on the number of elements to store and the desired false positive probability, making it impossible to store extra elements without increasing the false positive probability. This typically leads to a conservative assumption regarding maximum set size, possibly off by orders of magnitude, and consequently to wasted space. This paper proposes Scalable Bloom Filters, a variant of Bloom filters that can adapt dynamically to the number of elements stored while assuring a maximum false positive probability.
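The growth strategy described above can be sketched as follows: keep a list of plain Bloom filters and, when the current one fills, append a larger one with a tighter error target so the compound false positive probability stays bounded. This is a minimal illustration, not the paper's implementation; the sizing formulas are the standard Bloom filter ones, while the class names, growth factor, tightening ratio `r`, and SHA-256-based hashing are my own illustrative choices.

```python
import hashlib
import math

class BloomFilter:
    """Plain Bloom filter sized for `capacity` items at error rate `p`."""
    def __init__(self, capacity, p):
        self.capacity = capacity
        self.p = p
        # Standard sizing: m = -n ln p / (ln 2)^2, k = (m/n) ln 2.
        self.m = max(8, int(-capacity * math.log(p) / math.log(2) ** 2))
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)
        self.count = 0

    def _indexes(self, item):
        for i in range(self.k):  # k salted hashes per item
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for idx in self._indexes(item):
            self.bits[idx // 8] |= 1 << (idx % 8)
        self.count += 1

    def __contains__(self, item):
        return all(self.bits[idx // 8] >> (idx % 8) & 1
                   for idx in self._indexes(item))

class ScalableBloomFilter:
    """Grows by appending filters; stage i gets error p*r^(i+1), so the
    compound false positive probability is bounded by p*r/(1-r)."""
    def __init__(self, initial_capacity=128, p=0.01, growth=2, r=0.5):
        self.p, self.r, self.growth = p, r, growth
        self.filters = [BloomFilter(initial_capacity, p * r)]

    def add(self, item):
        f = self.filters[-1]
        if f.count >= f.capacity:  # current stage is full: scale up
            f = BloomFilter(f.capacity * self.growth, f.p * self.r)
            self.filters.append(f)
        f.add(item)

    def __contains__(self, item):
        return any(item in f for f in self.filters)
```

A query probes every stage, so there are never false negatives; the geometric tightening of each stage's error rate is what keeps the total bound fixed as the filter grows.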

2.
A Bloom filter is a space-efficient data structure used for probabilistic set membership testing. When testing an object for set membership, a Bloom filter may give a false positive. The analysis of the false positive rate is key to understanding the Bloom filter and the applications that use it. We show experimentally that the classic analysis of the false positive rate is wrong. We formally derive a correct formula using a balls-and-bins model and show how to compute the new, correct formula numerically in a stable manner. We also prove that the new formula always predicts a greater false positive rate than the classic formula. The correct formula is compared numerically to the classic one for relative error: for a small Bloom filter, the predicted false positive rate will be in error when the classic formula is used.
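The two analyses can be compared numerically. Below, `classic_fp` is Bloom's textbook estimate (1 - (1 - 1/m)^kn)^k, and `exact_fp` is the balls-and-bins expectation E[(I/m)^k], where I is the number of distinct bits set after n insertions with k hashes each. The function names are mine, and the exact form follows the published correction rather than anything quoted in this abstract; exact rational arithmetic keeps the small-filter computation stable.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial

def classic_fp(m, n, k):
    """Bloom's classic estimate of the false positive rate."""
    return (1.0 - (1.0 - 1.0 / m) ** (k * n)) ** k

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling numbers of the second kind, by the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def exact_fp(m, n, k):
    """Expected false positive rate from the balls-and-bins model:
    E[(I/m)^k].  comb(m, i) * i! * S(kn, i) counts the ways the kn
    insertion hashes leave exactly i of the m bits set."""
    total = 0
    for i in range(1, m + 1):
        total += i ** k * comb(m, i) * factorial(i) * stirling2(k * n, i)
    return Fraction(total, m ** (k * (n + 1)))
```

For k = 1 the two formulas coincide (the expectation is linear), while for k ≥ 2 the exact expectation exceeds the classic estimate, which is the direction the paper proves in general.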

3.
Detecting long flows is important for network measurement and management. This paper proposes a long-flow detection algorithm based on sampling and Bloom filters. Packets are first sampled and then hashed through Bloom filters; a temporary table and a flow-information table in memory are used to identify flows that reach a given threshold and to maintain their state. The algorithm meets the requirements of long-flow detection in high-speed networks, guaranteeing measurement accuracy while effectively controlling resource consumption. Experimental analysis shows that, compared with existing methods, it is simple to implement and keeps resource usage under control.

4.
Undoubtedly, dealing with security issues is one of the most important and complex tasks networks face today. A large number of security algorithms have been proposed to enhance security in various types of networks. Many of these solutions are based, directly or indirectly, on the Bloom filter (BF), a space- and time-efficient probabilistic data structure introduced by Burton Bloom in 1970. Bloom filters and their variants are receiving more and more attention in the network security area. This paper provides an up-to-date survey of the application of BFs and their variants to improve the performance of approaches proposed to address security problems in different types of networks.

5.
Duplicate web pages are widespread on today's Internet and increase the complexity of data mining and search. To address shortcomings of existing techniques, this paper applies a deduplication algorithm based on Bloom filters to duplicate web pages. An existing noise-removal algorithm is used to preprocess each page, and the Bloom filter structure greatly reduces the time and space complexity of deduplication. Long sentences that characterize a page are extracted, turning page deduplication into a search over those long sentences, with the Bloom filter reducing the time complexity of the search.
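The long-sentence fingerprint idea can be sketched as follows: extract a page's long sentences as features, then use a Bloom filter to test how many of them have been seen before. The sentence-length and duplicate-ratio thresholds, the filter parameters, and all names here are illustrative choices, not values from the paper.

```python
import hashlib
import re

def long_sentences(text, min_len=20):
    """Split page text into sentences and keep the long ones as the
    page's fingerprint features (threshold is illustrative)."""
    parts = re.split(r"[。.!?!?]+", text)
    return [s.strip() for s in parts if len(s.strip()) >= min_len]

class SentenceBloom:
    """Compact Bloom filter over sentence strings."""
    def __init__(self, m=1 << 16, k=4):
        self.m, self.k = m, k
        self.bits = bytearray(m // 8)

    def _idx(self, s):
        for i in range(self.k):
            d = hashlib.md5(f"{i}|{s}".encode()).digest()
            yield int.from_bytes(d[:8], "big") % self.m

    def add(self, s):
        for i in self._idx(s):
            self.bits[i // 8] |= 1 << (i % 8)

    def seen(self, s):
        return all(self.bits[i // 8] >> (i % 8) & 1 for i in self._idx(s))

def is_duplicate(page_text, bloom, threshold=0.8):
    """Flag the page as a duplicate if most of its long sentences were
    already recorded, then record its sentences for future pages."""
    feats = long_sentences(page_text)
    if not feats:
        return False
    hits = sum(bloom.seen(s) for s in feats)
    dup = hits / len(feats) >= threshold
    for s in feats:
        bloom.add(s)
    return dup
```

Checking a page costs one Bloom membership query per extracted sentence, which is the time-complexity saving over comparing pages against a stored corpus.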

6.
This paper proposes a lightweight, scalable broadcast authentication protocol for sensor networks. It applies compressed Bloom filters to multi-level μTESLA, replacing the Merkle-tree binding of the multi-level μTESLA key chains. The protocol provides a long lifetime and self-healing; by allocating appropriate storage at each receiver, it greatly reduces the communication load. Combined with an elliptic curve cryptography (ECC) algorithm, a public-key cryptography (PKC) technique, implementation and analysis show that the protocol guarantees long-term security while reducing energy consumption.

7.
We give two algorithms to randomly permute a linked list of length n in place using O(n log n) time and O(log n) stack space in both the expected case and the worst case. The first algorithm uses well-known sequential random sampling, and the second uses inverted sequential random sampling.

8.
On the false-positive rate of Bloom filters
Bloom filters are a randomized data structure for membership queries dating back to 1970. Bloom filters sometimes give erroneous answers to queries, called false positives. Bloom analyzed the probability of such erroneous answers, called the false-positive rate, and Bloom's analysis has appeared in many publications throughout the years. We show that Bloom's analysis is incorrect and give a correct analysis.

9.
Research on an adaptive mechanism for Bloom Filters in skewed data streams
When Counting Bloom Filters (CBF) are used to measure flow frequencies in skewed network data streams, their fixed-width counters overflow easily. To address this, we propose Adaptive Bloom Filters (ABF), which replace the fixed-size physical counters of CBF with extensible logical counters. A logical counter consists of a dynamically varying number of physical counters; initially a logical counter coincides with a single physical counter, but when its value overflows it adaptively expands to cover neighbouring physical counters, increasing its capacity and preserving measurement accuracy. Experiments show that ABF adapts better to variations in the measured frequencies without significantly increasing the false positive rate, giving it an advantage over other Counting Bloom Filters when measuring frequencies over skewed data.
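The overflow problem that motivates ABF can be demonstrated with a plain Counting Bloom Filter. The sketch below contrasts fixed 4-bit saturating counters with unbounded ones on a skewed stream; it does not reproduce ABF's logical-counter expansion, and all parameters and names are illustrative.

```python
import hashlib

class CountingBloom:
    """Counting Bloom filter with `width`-bit saturating counters
    (width=None means unbounded).  Fixed-width counters are exactly
    what overflows on skewed streams, the problem ABF targets."""
    def __init__(self, m=1024, k=3, width=4):
        self.m, self.k = m, k
        self.cap = None if width is None else (1 << width) - 1
        self.counters = [0] * m

    def _idx(self, item):
        for i in range(self.k):
            d = hashlib.md5(f"{i}|{item}".encode()).digest()
            yield int.from_bytes(d[:4], "big") % self.m

    def add(self, item):
        for i in self._idx(item):
            if self.cap is None or self.counters[i] < self.cap:
                self.counters[i] += 1

    def estimate(self, item):
        # Conservative estimate: minimum counter over the k slots.
        return min(self.counters[i] for i in self._idx(item))

# A skewed stream: one heavy flow plus many light ones.
stream = ["heavy"] * 500 + [f"light{i}" for i in range(50)]
fixed = CountingBloom(width=4)         # counters saturate at 15
unbounded = CountingBloom(width=None)  # stand-in for ABF's expanding counters
for item in stream:
    fixed.add(item)
    unbounded.add(item)
```

After the stream, the fixed-width filter reports the heavy flow at the saturation value 15, while the unbounded counters report at least its true frequency; ABF achieves the latter behaviour while still using small physical counters.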

10.
The length-constrained heaviest path (LCHP) in a weighted tree T, where each edge is assigned a weight and a length, is the path P in T with maximum total path weight and total path length bounded by a given value B. This paper presents an O(n log n) time LCHP algorithm which utilizes a data structure constructed from the spine decomposition of the input tree. This is an improvement over the existing algorithm by Wu et al. (1999), which runs in O(n log² n) time. Our method also improves on a previous O(n log n) time algorithm by Kim (2005) for the special case of finding a longest nonnegative path in a constant-degree tree, in that we can handle trees of arbitrary degree within the same time bounds.

11.
We give an algorithm to compute the subset partial order (called the subset graph) for a family F of sets containing k sets with N elements in total and domain size n. Our algorithm requires O(nk²/log k) time and space on a Pointer Machine. When F is dense, i.e. N = Θ(nk), the algorithm requires O(N²/log² N) time and space. We give a construction for a dense family whose subset graph is of size Θ(N²/log² N), indicating the optimality of our algorithm for dense families. The subset graph can be dynamically maintained when F undergoes set insertions and deletions in O(nk/log k) time per update (that is, sub-linear in N for dense families). If we assume words of b ≤ k bits, allow bits to be packed in words, and use bitwise operations, the above running time and space requirements can be reduced by factors of b·log(k/b+1)/log k and b²·log(k/b+1)/log k, respectively.

12.
The Data Distribution Service (DDS) middleware has recently been standardized by the OMG. Prior to data communication, a discovery protocol has to locate remote DDS entities and obtain their attributes. Specifically, DDS discovery matches the DataWriter (DW) and DataReader (DR) entities (Endpoints) situated in different network nodes. The DDS specification does not specify how this discovery is translated “into the wire”. To provide interoperability and transparency between different DDS implementations, the OMG has standardized the DDS Interoperability Wire Protocol (DDS-RTPS). Any compliant DDS-RTPS implementation must support at least the Simple Discovery Protocol (SDP). The SDP works in relatively small or medium networks, but it may not scale as the number of DDS Endpoints increases. This paper addresses the design and evaluation of an SDP alternative that uses Bloom Filters (BF) to increase DDS scalability. BFs use hash functions for space-efficient probabilistic representation of data sets. We provide both analytical and experimental studies. Results show that our approach can improve the discovery process (in terms of network load and node resource consumption), especially in scenarios with large Endpoint-per-Participant ratios.

13.
Double hashing with bucket capacity one is augmented with multiple passbits to obtain a significant reduction in unsuccessful search lengths. This improves the analysis of Martini et al. [P.M. Martini, W.A. Burkhard, Double hashing with multiple passbits, Internat. J. Found. Theoret. Comput. Sci. 14 (6) (2003) 1165-1188] by providing a closed-form expression for the expected unsuccessful search length.
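The underlying scheme, double hashing with bucket capacity one, probes slot (h1(x) + i·h2(x)) mod m on the i-th attempt. The sketch below shows that probe sequence and the unsuccessful-search length the passbit scheme aims to shorten; the passbits themselves are not reproduced, and the hash construction and table size are illustrative.

```python
import hashlib

class DoubleHashTable:
    """Open addressing with double hashing, bucket capacity one.
    Probe i lands at (h1 + i*h2) mod m; h2 is forced odd so the probe
    sequence visits every slot when m is a power of two."""
    def __init__(self, m=64):
        self.m = m
        self.slots = [None] * m

    def _h(self, key):
        d = hashlib.sha1(str(key).encode()).digest()
        h1 = int.from_bytes(d[:8], "big") % self.m
        h2 = int.from_bytes(d[8:16], "big") % self.m | 1  # force odd
        return h1, h2

    def insert(self, key):
        h1, h2 = self._h(key)
        for i in range(self.m):
            j = (h1 + i * h2) % self.m
            if self.slots[j] is None or self.slots[j] == key:
                self.slots[j] = key
                return j
        raise RuntimeError("table full")

    def search(self, key):
        """Returns (found, probes); for an absent key, `probes` is the
        unsuccessful search length that passbits cut down."""
        h1, h2 = self._h(key)
        for i in range(self.m):
            j = (h1 + i * h2) % self.m
            if self.slots[j] is None:
                return False, i + 1
            if self.slots[j] == key:
                return True, i + 1
        return False, self.m
```

A passbit scheme would store a few extra bits per slot recording where displaced keys went, letting many unsuccessful searches stop before hitting an empty slot.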

14.
It is well known how to preprocess a rooted tree in linear time to yield the lowest common ancestor of any given pair of nodes in constant time. We generalize these algorithms for graphs called Arbitrarily Directed Trees, or ADTs. Such trees are obtained by reversing the directions of some edges in trees with the customary “root-to-leaf” orientation of edges.
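The rooted-tree building block the paper generalizes can be sketched with the classic Euler tour plus sparse table scheme. This variant uses O(n log n) preprocessing for O(1) queries (the linear-time schemes refine the table construction); it handles ordinary rooted trees only, not ADTs, and the representation of the tree as a child dictionary is my own choice.

```python
def preprocess_lca(children, root=0):
    """Euler tour + sparse table over depths.  `children` maps each
    node to a list of its children; returns an O(1) lca(u, v) query."""
    euler, depth, first = [], [], {}
    stack = [(root, 0, iter(children.get(root, ())))]
    while stack:  # iterative DFS emitting the Euler tour
        node, d, it = stack[-1]
        first.setdefault(node, len(euler))
        euler.append(node)
        depth.append(d)
        child = next(it, None)
        if child is None:
            stack.pop()
        else:
            stack.append((child, d + 1, iter(children.get(child, ()))))

    m = len(euler)
    sp = [list(range(m))]  # sparse table of argmin-depth positions
    j = 1
    while (1 << j) <= m:
        prev, half = sp[-1], 1 << (j - 1)
        sp.append([prev[i] if depth[prev[i]] <= depth[prev[i + half]]
                   else prev[i + half]
                   for i in range(m - (1 << j) + 1)])
        j += 1

    def lca(u, v):
        # The LCA is the shallowest node between the first occurrences
        # of u and v on the Euler tour.
        l, r = sorted((first[u], first[v]))
        jj = (r - l + 1).bit_length() - 1
        a, b = sp[jj][l], sp[jj][r - (1 << jj) + 1]
        return euler[a] if depth[a] <= depth[b] else euler[b]

    return lca
```

The range-minimum query over depths is what an ADT generalization would have to reinterpret, since reversing edge directions changes which occurrence counts as the "shallowest common" node.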

15.
Given a sequence of n elements, we introduce the notion of an almost-increasing subsequence: the longest subsequence that can be converted to an increasing subsequence by adding to each element a value that is at most a fixed constant. We show how to construct such a subsequence optimally, where k is the length of the output subsequence.

16.
We consider the following problem: given an unsorted array of n elements and a sequence of intervals in the array, compute the median in each of the subarrays defined by the intervals. We describe a simple algorithm which needs O(n log k + k log n) time to answer k such median queries. This improves previous algorithms by a logarithmic factor and matches a comparison lower bound for k = O(n). The space complexity of our simple algorithm is O(n log n) in the pointer machine model and O(n) in the RAM model. In the latter model, a more involved O(n) space data structure can be constructed in O(n log n) time, where the time per query is reduced to O(log n / log log n). We also give efficient dynamic variants of both data structures, achieving O(log² n) query time using O(n log n) space in the comparison model and O((log n / log log n)²) query time using O(n log n / log log n) space in the RAM model, and show that in the cell-probe model any data structure which supports updates in O(log^O(1) n) time must have Ω(log n / log log n) query time. Our approach naturally generalizes to higher-dimensional range median problems, where element positions and query ranges are multidimensional: it reduces a range median query to a logarithmic number of range counting queries.
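The reduction from range median to range counting can be sketched with a merge-sort tree: a median query binary-searches over the sorted values, and each step asks how many elements in the interval are at most a candidate value. This illustrates the reduction only, not the paper's optimal bounds (it spends O(log² n) per counting query and O(n log n) space); the class and method names are mine.

```python
import bisect

class MergeSortTree:
    """Segment tree whose nodes store their range sorted, supporting
    count-of-elements <= x in a[l..r] in O(log^2 n)."""
    def __init__(self, a):
        self.n = len(a)
        size = 1
        while size < self.n:
            size *= 2
        self.size = size
        self.tree = [[] for _ in range(2 * size)]
        for i, x in enumerate(a):
            self.tree[size + i] = [x]
        for i in range(size - 1, 0, -1):
            self.tree[i] = sorted(self.tree[2 * i] + self.tree[2 * i + 1])

    def count_le(self, l, r, x):
        """Number of elements <= x in a[l..r] (inclusive)."""
        res = 0
        l += self.size
        r += self.size + 1
        while l < r:
            if l & 1:
                res += bisect.bisect_right(self.tree[l], x)
                l += 1
            if r & 1:
                r -= 1
                res += bisect.bisect_right(self.tree[r], x)
            l >>= 1
            r >>= 1
        return res

    def median(self, l, r):
        """Lower median of a[l..r]: binary search over sorted values,
        each step answered by one range counting query."""
        need = (r - l) // 2 + 1
        vals = self.tree[1]  # root holds all values, sorted
        lo, hi = 0, len(vals) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if self.count_le(l, r, vals[mid]) >= need:
                hi = mid
            else:
                lo = mid + 1
        return vals[lo]
```

Because only counting queries touch the interval, the same outer binary search works unchanged for multidimensional ranges once a multidimensional range counting structure is substituted.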

17.
For a tree T, the Level-Ancestor problem is the following: given a node x and an integer depth d, find the ancestor of x in T that is at depth d in T. We formulate this problem precisely for the Pure Pointer Machine model, i.e., pointer machines without arithmetic capabilities. We then provide a solution for the problem in the fully dynamic case that runs in worst-case O(lg n) time per operation, and prove a matching lower bound of Ω(lg n).

18.
The identification of significant attributes is of major importance to the performance of a variety of Learning Classifier Systems, including the newly emerged Bioinformatics-oriented Hierarchical Evolutionary Learning (BioHEL) algorithm. However, BioHEL fails to deliver on a set of synthetic datasets, checkerboard data mixed with Gaussian noise, because the significant attributes are not successfully recognised. To address this issue, a univariate Estimation of Distribution Algorithm (EDA) technique is introduced into BioHEL, which builds a probabilistic model upon the outcome of the generalization and specialization operations. The probabilistic model, which estimates the significance of each attribute, provides guidance for the exploration of the problem space. Experimental evaluations showed that the proposed BioHEL systems achieve performance comparable to the conventional one on a number of small-scale real-world datasets. Research effort was also devoted to finding the optimal parameters for the traditional and proposed BioHEL systems.
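The univariate EDA machinery involved here can be sketched in isolation with UMDA: fit an independent Bernoulli model to the best individuals, then sample the next population from it, so that informative positions drift toward their good values. This does not reproduce the BioHEL integration; the toy onemax fitness and all parameters (population size, truncation, clamping) are illustrative.

```python
import random

def umda(fitness, n_bits, pop=60, keep=20, gens=40, seed=0):
    """Univariate EDA (UMDA): each generation samples `pop` bitstrings
    from a per-bit Bernoulli model, then re-fits the model to the
    `keep` fittest individuals."""
    rng = random.Random(seed)
    p = [0.5] * n_bits  # univariate probabilistic model
    best = None
    for _ in range(gens):
        popn = [[1 if rng.random() < p[i] else 0 for i in range(n_bits)]
                for _ in range(pop)]
        popn.sort(key=fitness, reverse=True)
        if best is None or fitness(popn[0]) > fitness(best):
            best = popn[0]
        elite = popn[:keep]
        # Re-estimate each marginal from the elite; clamping keeps a
        # little exploration alive in every position.
        for i in range(n_bits):
            freq = sum(ind[i] for ind in elite) / keep
            p[i] = min(0.95, max(0.05, freq))
    return best

# Toy problem: onemax, where every bit is a "significant attribute".
best = umda(sum, 20)
```

In the BioHEL setting the marginals play the role of attribute-significance estimates: attributes whose marginals stay near 0.5 carry no signal, which is the guidance the proposed system feeds back into generalization and specialization.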

19.
Universal hash functions for an infinite universe and hash trees
In this note we describe the adaptation of the universal hashing idea to an infinite universe, and to prefix hash trees. Prefix hash trees are a structure underlying all extendible hashing methods, which have up to now only been studied under the uniform hashing assumption.
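The starting point, Carter-Wegman universal hashing, picks h(x) = ((a·x + b) mod p) mod m for random a, b. One common way to extend it toward an unbounded universe is to first fold arbitrary-length keys to an integer mod p with a random-base polynomial; that folding step is a standard technique, not necessarily the note's construction, and the names and the choice of prime are mine.

```python
import random

P = (1 << 61) - 1  # a Mersenne prime comfortably above the hashed values

def make_universal_hash(m, seed=None):
    """Draw one hash function h: keys -> [0, m) from a Carter-Wegman
    family, h(x) = ((a*x + b) mod P) mod m."""
    rng = random.Random(seed)
    a = rng.randrange(1, P)
    b = rng.randrange(P)
    base = rng.randrange(256, P)

    def fold(key):
        # Polynomial rolling fold of a string key to an integer mod P.
        if isinstance(key, int):
            return key % P
        x = 0
        for byte in str(key).encode():
            x = (x * base + byte) % P
        return x

    def h(key):
        return ((a * fold(key) + b) % P) % m

    return h
```

Each call to `make_universal_hash` draws an independent function from the family; a prefix hash tree would apply such functions level by level to successively longer key prefixes.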

20.
In this paper we present a concise O(n) implementation of Cleary's algorithm for generating a sequence of restricted rotations between any two binary trees. The algorithm is described directly in terms of the binary trees, without using any intermediate representation.
