首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 546 毫秒
1.
In this paper, a novel discovery scheme using modified counting Bloom filters is proposed for data distribution service (DDS) for real-time distributed system. In a large scale network for combat management system (CMS), a lot of memory is required to store all the information. In addition, high network traffic can become problematic. In many cases, most of the information stored is not needed by the DDS's endpoints but occupy memory storage. To reduce the size of information sent and stored, a discovery process combined with counting Bloom filters is proposed. This paper presents delay time for filters construction and total discovery time needed in DDS's discovery process. Simulation results show that the proposed method gives low delay time and zero false positive probability.  相似文献   

2.
One widely used mechanism for representing membership of a set of items is the simple space-efficient randomized data structure known as Bloom filters. Yet, Bloom filters are not entirely suitable for many new network applications that support network services like the representation and querying of items that have multiple attributes as opposed to a single attribute. In this paper, we present an approach to the accurate and efficient representation and querying of multiattribute items using Bloom filters. The approach proposes three variant structures of Bloom filters: Parallel Bloom Filter (referred as PBF) structure, PBF with a hash table (PBF-HT), and PBF with a Bloom filter (PBF-BF). PBF stores multiple attributes of an item in parallel Bloom filters. The auxiliary HT and BF provide functions to capture the inherent dependency of all attributes of an item. Compared to standard Bloom filters to represent items with multiple attributes, the proposed PBF facilitates much faster query service and both PBF-HT and PBF-BF structures achieve much lower false positive probability with a result to save storage space. Simulation and experimental results demonstrate that the new space-efficient Bloom filter structures can efficiently and accurately represent multiattribute items and quickly respond queries at the cost of a relatively small false positive probability.  相似文献   

3.
4.
A Bloom filter is a space-efficient data structure used for concisely representing a set as well as membership queries at the expense of introducing false positive. In this paper, we propose the L-priorities Bloom filter (LPBF) as a new member of the Bloom filter (BF) family, it uses a limited multidimensional bit space matrix to replace the bit vector of standard bloom filters in order to support different priorities for the elements of a set. We demonstrate the time and space complexity, especially the false positive rate of LPBF. Furthermore, we also present a detailed practical evaluation of the false positive rate achieved by LPBF. The results show that LPBF performs better than standard BFs with respect to false positive rate.  相似文献   

5.
A Bloom filter is an effective, space-efficient data structure for concisely representing a set, and supporting approximate membership queries. Traditionally, the Bloom filter and its variants just focus on how to represent a static set and decrease the false positive probability to a sufficiently low level. By investigating mainstream applications based on the Bloom filter, we reveal that dynamic data sets are more common and important than static sets. However, existing variants of the Bloom filter cannot support dynamic data sets well. To address this issue, we propose dynamic Bloom filters to represent dynamic sets, as well as static sets and design necessary item insertion, membership query, item deletion, and filter union algorithms. The dynamic Bloom filter can control the false positive probability at a low level by expanding its capacity as the set cardinality increases. Through comprehensive mathematical analysis, we show that the dynamic Bloom filter uses less expected memory than the Bloom filter when representing dynamic sets with an upper bound on set cardinality, and also that the dynamic Bloom filter is more stable than the Bloom filter due to infrequent reconstruction when addressing dynamic sets without an upper bound on set cardinality. Moreover, the analysis results hold in stand-alone applications, as well as distributed applications.  相似文献   

6.
Where distributed agents must share voluminous set membership information, Bloom filters provide a compact, though lossy, way for them to do so. Numerous recent networking papers have examined the trade-offs between the bandwidth consumed by the transmission of Bloom filters, and the error rate, which takes the form of false positives. This paper is about the retouched Bloom filter (RBF). An RBF is an extension that makes the Bloom filter more flexible by permitting the removal of false positives, at the expense of introducing false negatives, and that allows a controlled trade-off between the two. We analytically show that creating RBFs through a random process decreases the false positive rate in the same proportion as the false negative rate that is generated. We further provide some simple heuristics that decrease the false positive rate more than the corresponding increase in the false negative rate, when creating RBFs. These heuristics are more effective than the ones we have presented in prior work. We further demonstrate the advantages of an RBF over a Bloom filter in a distributed network topology measurement application. We finally discuss several networking applications that could benefit from RBFs instead of standard Bloom filters.  相似文献   

7.
Bloom filters are extensively used in distributed applications, especially in distributed databases and distributed information systems, to reduce network requirements and to increase performance. In this work, we propose two novel Bloom filter features that are important for distributed databases and information systems. First, we present a new approach to encode a Bloom filter such that its length can be adapted to the cardinality of the set it represents, with negligible overhead with respect to computation and false positive probability. The proposed encoding allows for significant network savings in distributed databases, as it enables the participating nodes to optimize the length of each Bloom filter before sending it over the network, for example, when executing Bloom joins. Second, we show how to estimate the number of distinct elements in a Bloom filter, for situations where the represented set is not materialized. These situations frequently arise in distributed databases, where estimating the cardinality of the represented sets is necessary for constructing an efficient query plan. The estimation is highly accurate and comes with tight probabilistic bounds. For both features we provide a thorough probabilistic analysis and extensive experimental evaluation which confirm the effectiveness of our approaches.  相似文献   

8.
Invertible Bloom Lookup Tables (IBLTs) have been recently introduced as an extension of traditional Bloom filters. IBLTs store key-value pairs. Unlike traditional Bloom filters, IBLTs support both a lookup operation (given a key, return a value) and an operation that lists out all the key-value pairs stored. One issue with IBLTs is that there is a probability that a lookup operation will return “not found” for a key. In this paper, a technique to reduce this probability without affecting the storage requirement and only moderately increasing the search time is presented and evaluated. The results show that it can significantly reduce the probability of not returning a value that is actually stored in the IBLT. The overhead of the modified search procedure, compared to the standard IBLT search procedure, is small and has little impact on the average search time.  相似文献   

9.
On the false-positive rate of Bloom filters   总被引:1,自引:0,他引:1  
Bloom filters are a randomized data structure for membership queries dating back to 1970. Bloom filters sometimes give erroneous answers to queries, called false positives. Bloom analyzed the probability of such erroneous answers, called the false-positive rate, and Bloom's analysis has appeared in many publications throughout the years. We show that Bloom's analysis is incorrect and give a correct analysis.  相似文献   

10.
Synchronization between two sets is an important requirement for many distributed applications. A basic prerequisite is to find out which elements of set A are not in set B and vice versa. A very space efficient data structure for such membership queries that has been used a lot in networking applications is the Bloom filter. Unfortunately, the Bloom filter owes its high efficiency to the fact that there is a chance of false positives when querying the filter. This precludes the adoption of Bloom filters in applications that cannot tolerate such errors. In this paper we present an approach that augments Bloom filters with a trie-based mechanism that deterministically and efficiently finds the false positives after using the Bloom filter to synchronize two sets. We show that the added communication overhead for our approach is negligible compared to the overhead of a plain Bloom filter.  相似文献   

11.
典型Bloom过滤器的研究及其数据流应用   总被引:1,自引:0,他引:1       下载免费PDF全文
Bloom过滤器是一种空间高效但有一定假阳性的数据表示方法。该文分析比较计数型Bloom过滤器、光谱Bloom过滤器和动态计数过滤器的异同点及适用场合,介绍Bloom过滤器在重复项检测及频繁项挖掘中的应用,总结Bloom过滤器给数据流带来的挑战,包括元素突发问题及数据流相异元素数目变化问题。  相似文献   

12.
Bloom filter (BF) is a space-efficient data structure that represents a large set of items and supports efficient membership queries. It has been widely proposed to employ Bloom filters in the routing entries so as to facilitate data-centric routing in network applications. The existing designs of Bloom filters, however, cannot effectively support in-network queries. Given a query for a data item at a node in the network, the noise in unrelated routing entries very likely equals to the useful information of the item in the right routing entries. Consequently, the majority of queries are routed towards many wrong nodes besides those destinations, wasting large quantities of network traffic. To address this issue, we classified the existing designs as CUBF (Cumulative Bloom filters) and ABF (Aggregated Bloom filters), and then evaluate their performance in routing queries under the noisy environments. Based on the evaluation results, we propose a receiver-oriented design of Bloom filters to sufficiently restrict the probability of a wrong routing decision. Moreover, we significantly decrease the delay of a routing decision in the case of CUBF by using the bit slice approach, and reduce the transmission size of each BF in the case of ABF by using the compression approach. Both the theoretical analysis and experimental results demonstrate that our receiver-oriented design of Bloom filters apparently outperforms the existing approaches in terms of the success probability of routing and network traffic cost.  相似文献   

13.
Scalability is a key issue in datacenter multicast as it needs to support a large number of groups in commodity switches with limited fast memory. Previous in-packet Bloom filter-based datacenter multicast schemes have been proposed to address the scalability issue. They encode a multicast tree in a Bloom filter carried in each packet. However, these schemes induce high bandwidth overhead due to the false positives inherent in Bloom filters, and cannot scale well to the increasing variety of group sizes. In this paper, we propose an in-packet bitmap-based approach towards scalable datacenter multicast, improving the bandwidth efficiency. We use a bitmap to encode switch ports of a multicast tree in the packet header, eliminating false positive forwarding. In addition, we use a clustered Golomb coding method to compress in-packet bitmaps for further reducing the bandwidth overhead. Experimental results on simulations and a Click-based switch prototype demonstrate that our scheme achieves up to several orders of magnitude reductions in bandwidth overhead compared to previous schemes.  相似文献   

14.
Deep packet inspection using parallel bloom filters   总被引:2,自引:0,他引:2  
There is a class of packet processing applications that inspect packets deeper than the protocol headers to analyze content. For instance, network security applications must drop packets containing certain malicious Internet worms or computer viruses carried in a packet payload. Content forwarding applications look at the hypertext transport protocol headers and distribute the requests among the servers for load balancing. Packet inspection applications, when deployed at router ports, must operate at wire speeds. With networking speeds doubling every year, it is becoming increasingly difficult for software-based packet monitors to keep up with the line rates. We describe a hardware-based technique using Bloom filters, which can detect strings in streaming data without degrading network throughput. A Bloom filter is a data structure that stores a set of signatures compactly by computing multiple hash functions on each member of the set. This technique queries a database of strings to check for the membership of a particular string. The answer to this query can be false positive but never a false negative. An important property of this data structure is that the computation time involved in performing the query is independent of the number of strings in the database provided the memory used by the data structure scales linearly with the number of strings stored in it. Furthermore, the amount of storage required by the Bloom filter for each string is independent of its length.  相似文献   

15.
A Bloom filter is a space-efficient data structure used for probabilistic set membership testing. When testing an object for set membership, a Bloom filter may give a false positive. The analysis of the false positive rate is a key to understanding the Bloom filter and applications that use it. We show experimentally that the classic analysis for false positive rate is wrong. We formally derive a correct formula using a balls-and-bins model and show how to numerically compute the new, correct formula in a stable manner. We also prove that the new formula always results in a predicted greater false positive rate than the classic formula. This correct formula is numerically compared to the classic formula for relative error - for a small Bloom filter the prediction of false positive rate will be in error when the classic formula is used.  相似文献   

16.
许多应用场景所产生的数据流中,元素的频数分布符合重尾分布的特点,即大部分元素的频数较小而少部分元素的频数较大.为了解决数据流中所有相异元素及其频数的高效存储问题,提出了一个基于分层的计数型布卢姆过滤器(hierarchical counting Bloom filter,HCBF)保存所有元素频数的方法.该方法采用长度递减、计数单位递增的多层计数型布卢姆过滤器作为存储数据结构,多层过滤器共同组成元素的频数.与两个经典的计数型布卢姆过滤器CBF和DCF相比,HCBF更加适合真实数据流元素频数分布的重尾特点,在不影响查询性能和错误率的前提下,能够显著地降低空间开销.理论分析与实验结果验证了该结论.  相似文献   

17.
探讨双布鲁姆过滤器查询法查询集合并集、交集、补集、差集或对称差成员的性能问题。理论分析和实验结果表明,双布鲁姆过滤器查询法能够较好地支持集合并集、交集、补集、差集及对称差的成员查询问题,其中双布鲁姆过滤器并集及交集查询不会产生假阴性,仅有少量假阳性的存在,而双布鲁姆过滤器补集、差集及对称差查询则除存在少量假阳性外,还存在少量假阴性。  相似文献   

18.
Detecting duplicates in data streams is an important problem that has a wide range of applications. In general,precisely detecting duplicates in an unbounded data stream is not feasible in most streaming scenarios,and,on the other hand,the elements in data streams are always time sensitive. These make it particular significant approximately detecting duplicates among newly arrived elements of a data stream within a fixed time frame. In this paper,we present a novel data structure,Decaying Bloom Filter(DBF),as an extension of the Counting Bloom Filter,that effectively removes stale elements as new elements continuously arrive over sliding windows. On the DBF basis we present an efficient algorithm to approximately detect duplicates over sliding windows. Our algorithm may produce false positive errors,but not false negative errors as in many previous results. We analyze the time complexity and detection accuracy,and give a tight upper bound of false positive rate. For a given space G bits and sliding window size W,our algorithm has an amortized time complexity of O((G/W))~(1/2). Both analytical and experimental results on synthetic data demonstrate that our algorithm is superior in both execution time and detection accuracy to the previous results.  相似文献   

19.
A Bloom filter is a simple but powerful data structure that can check membership to a static set. As Bloom filters become more popular for network applications, a membership query for a dynamic set is also required. Some network applications require high-speed processing of packets. For this purpose, Bloom filters should reside in a fast and small memory, SRAM. In this case, due to the limited memory size, stale data in the Bloom filter should be deleted to make space for new data. Namely the Bloom filter needs aging like LRU caching. In this paper, we propose a new aging scheme for Bloom filters. The proposed scheme utilizes the memory space more efficiently than double buffering, the current state of the art. We prove theoretically that the proposed scheme outperforms double buffering. We also perform experiments on real Internet traces to verify the effectiveness of the proposed scheme.  相似文献   

20.
针对计数性布鲁姆过滤器存储数据时计数器溢出的缺陷,提出了一种基于分层计数型布鲁姆过滤器(hierarchy counting Bloom filter,HCBF)的大流检测机制。该方法结合溢出概率函数的特性,将计数型布鲁姆过滤器从一层扩展到多层,并能自适应地配置各层计数型布鲁姆过滤器的参数,能够对大流进行较好的识别。基于互联网数据进行了仿真实验,结果显示:与计数型布鲁姆过滤器相比,在同样溢出概率条件下,提高大流检测精度的同时节省了大量的内存资源。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号