Similar Literature
20 similar documents found (search time: 234 ms)
1.
In this paper, a novel discovery scheme using modified counting Bloom filters is proposed for the Data Distribution Service (DDS) in real-time distributed systems. In a large-scale network for a combat management system (CMS), a lot of memory is required to store all the information, and high network traffic can become problematic. In many cases, most of the stored information is not needed by the DDS endpoints but still occupies memory. To reduce the amount of information sent and stored, a discovery process combined with counting Bloom filters is proposed. This paper reports the delay of filter construction and the total time needed for the DDS discovery process. Simulation results show that the proposed method achieves low delay and zero false-positive probability.
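Since the scheme builds on counting Bloom filters, a minimal sketch of that underlying structure may help; this is an illustration only (the hash construction, sizes, and the `CountingBloomFilter` name are assumptions, not the paper's implementation): counters replace the bit array so that entries can be removed from the summary as well as added.

```python
import hashlib

class CountingBloomFilter:
    """Minimal counting Bloom filter: an array of counters instead of bits,
    so elements can be deleted as well as inserted."""

    def __init__(self, m=1024, k=4):
        self.m = m                  # number of counters (assumed size)
        self.k = k                  # number of hash functions (assumed)
        self.counters = [0] * m

    def _positions(self, item):
        # Derive k positions from salted SHA-1 digests of the item.
        for i in range(self.k):
            h = hashlib.sha1(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def insert(self, item):
        for p in self._positions(item):
            self.counters[p] += 1

    def query(self, item):
        # True means "possibly present" (false positives are possible).
        return all(self.counters[p] > 0 for p in self._positions(item))

    def delete(self, item):
        # Only safe if the item was actually inserted.
        for p in self._positions(item):
            if self.counters[p] > 0:
                self.counters[p] -= 1

cbf = CountingBloomFilter()
cbf.insert("endpoint/topic-A")
assert cbf.query("endpoint/topic-A")
cbf.delete("endpoint/topic-A")
```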

2.
Hash tables are widely used in network applications, as they can achieve O(1) query, insert, and delete operations at moderate loads. At high loads, however, collisions become prevalent in the table, which increases access time and induces non-deterministic performance. Slow rates and non-determinism can considerably hurt the performance and scalability of hash tables in multi-threaded parallel systems such as ASIC/FPGA and multi-core platforms, so it is critical to keep hash operations fast and deterministic. This paper presents a novel fast collision-free hashing scheme using Discriminative Bloom Filters (DBFs) to achieve fast and deterministic hash table lookup. A DBF is a compact summary stored in on-chip memory, composed of an array of parallel Bloom filters organized by a discriminator. Each element lookup performs parallel membership checks on the on-chip DBF to produce a candidate discriminator value. The element plus the discriminator value is then hashed to a candidate bucket in an off-chip hash table, where the match is validated. This DBF-based scheme requires only one off-chip memory access per lookup and uses less off-chip memory. Experiments show that our scheme achieves up to an 8.5-fold reduction in the number of off-chip memory accesses per lookup compared with previous schemes.
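A hedged sketch of the two-step lookup described above (class and function names, sizes, and hashing are assumptions, not the paper's code): an on-chip array of Bloom filters, one per discriminator value, is probed for the key; a surviving discriminator is then combined with the key to address a single bucket in the off-chip hash table.

```python
import hashlib

def _hashes(item, m, k):
    # k positions in a bit array of size m, derived from salted digests.
    return [int(hashlib.sha1(f"{i}:{item}".encode()).hexdigest(), 16) % m
            for i in range(k)]

class DiscriminativeBloomFilter:
    """Sketch of a DBF: one small Bloom filter per discriminator value."""

    def __init__(self, num_discriminators=8, m=4096, k=4):
        self.m, self.k = m, k
        self.filters = [[0] * m for _ in range(num_discriminators)]

    def insert(self, key, discriminator):
        for p in _hashes(key, self.m, self.k):
            self.filters[discriminator][p] = 1

    def lookup_discriminator(self, key):
        # Probe every per-discriminator filter; return candidate values.
        return [d for d, bits in enumerate(self.filters)
                if all(bits[p] for p in _hashes(key, self.m, self.k))]

def offchip_bucket(key, discriminator, num_buckets=1 << 20):
    # The (key, discriminator) pair selects one bucket in the off-chip table,
    # where the actual entry is validated.
    return int(hashlib.sha1(f"{discriminator}:{key}".encode()).hexdigest(), 16) % num_buckets
```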

3.
The trie is a search tree that organizes information by search keys and can be used to store and search string sets efficiently. T. Nipkow gave an Isabelle model and verification of a trie implementation; however, that trie contains considerable redundancy in both storage and operations, resulting in poor space utilization, and it only considers single-pattern search over English text. To address this, this paper proposes the Trie+ structure, based on the idea of treating the index itself as the key; compared with the traditional structure that stores indices and keys separately, it reduces storage space by 50% and greatly improves space utilization. Moreover, functional models and rigorous mechanized verification are given for the lookup, insertion, and deletion operations of Trie+, guaranteeing the correctness and reliability of these operations. Furthermore, a general verification specification for matching algorithms is proposed for the first time, aiming to address the correctness verification of a whole family of matching algorithms. Finally, based on the Trie+ structure and the general verification specification, a functional mixed Chinese-English multi-pattern matching algorithm is modeled and verified, and a pattern-prefix-termination bug in the existing perfect-hash-trie-based multi-pattern matching algorithm is found and fixed. The proposed Trie+ structure and verification specification have theoretical and practical value for improving the space utilization of tries and for verifying matching algorithms.

4.
To improve IPv6 route lookup efficiency and address the uneven distribution of IPv6 route prefixes, an IPv6 route lookup algorithm combining a B-tree with Bloom filters (BTBF) is proposed. BTBF consists of two lookup stages: the B-tree stage searches the first 16 bits of the route prefix, the bit-vector mapping in the B-tree node then links the lookup to a Bloom filter, and the value mapping of the Bloom filter's bit array finally yields the next hop. Experimental results show that, compared with other tree-based and Bloom-filter-based algorithms, BTBF effectively reduces both space and time consumption, and it maintains stable lookup performance even when the number of routing table entries varies widely.

5.
胡国良, 林亚平, 王刚, 姚鑫. 《计算机应用》, 2012, 32(11): 3132-3135
To overcome the limitations of software- and hardware-based deep packet inspection, namely slow processing or difficulty in updating the rule set, a deep packet inspection algorithm based on a group of parallel Bloom filters on a multi-core platform is proposed. The algorithm first partitions the rule set by rule length and builds a group of parallel Bloom filters, in which each counting Bloom filter represents the subset of rules of a particular length. To reduce the collision probability and the amount of computation during execution, high-performance hash functions are constructed, and the algorithm is then implemented with parallel programming to exploit the parallel processing capability of the multi-core platform. Theoretical analysis and experimental results show that the algorithm is efficient in both time and space.
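A rough sketch of the per-length screening idea (the grouping, the toy single-hash counting filter, and the exact-match step are assumptions; a real DPI engine works on raw bytes, uses several hash functions, and parallelizes across cores): each pattern length gets its own counting filter, payload windows of that length are screened by the filter first, and only hits are verified against the actual rules.

```python
from collections import defaultdict

class LengthGroupedFilters:
    """Sketch: one counting filter (modeled here as a counter dict) per
    pattern length; suspected matches are verified exactly."""

    def __init__(self, patterns):
        self.by_length = defaultdict(set)
        self.filters = defaultdict(lambda: defaultdict(int))
        for p in patterns:
            self.by_length[len(p)].add(p)
            self.filters[len(p)][hash(p) % 8192] += 1   # toy 1-hash counting filter

    def scan(self, payload):
        matches = []
        for length, rules in self.by_length.items():
            flt = self.filters[length]
            for i in range(len(payload) - length + 1):
                window = payload[i:i + length]
                if flt[hash(window) % 8192]:          # Bloom-filter screen
                    if window in rules:               # exact verification
                        matches.append((i, window))
        return matches

dpi = LengthGroupedFilters(["attack", "exploit", "rm -rf"])
print(dpi.scan("GET /exploit.cgi HTTP/1.1"))   # -> [(5, 'exploit')]
```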

6.
Bloom filters provide space-efficient storage of sets at the cost of a probability of false positives on membership queries. The size of the filter must be defined a priori, based on the number of elements to store and the desired false positive probability, making it impossible to store extra elements without increasing the false positive probability. This typically leads to a conservative over-estimate of the maximum set size, possibly by orders of magnitude, and consequently to wasted space. This paper proposes Scalable Bloom Filters, a variant of Bloom filters that can adapt dynamically to the number of elements stored while assuring a maximum false positive probability.
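A minimal sketch of the scaling mechanism (growth factor, error-tightening ratio, and the plain `BloomFilter` helper are assumptions): plain Bloom filters are chained, and whenever the current one reaches capacity a larger filter with a tighter per-slice error is appended, so the compounded false positive probability stays below the overall target.

```python
import math, hashlib

class BloomFilter:
    """Plain Bloom filter sized for a given capacity and error rate."""
    def __init__(self, capacity, error):
        self.capacity, self.count = capacity, 0
        self.m = max(1, int(-capacity * math.log(error) / math.log(2) ** 2))
        self.k = max(1, int(round(self.m / capacity * math.log(2))))
        self.bits = [0] * self.m

    def _pos(self, item):
        return [int(hashlib.sha1(f"{i}:{item}".encode()).hexdigest(), 16) % self.m
                for i in range(self.k)]

    def add(self, item):
        for p in self._pos(item):
            self.bits[p] = 1
        self.count += 1

    def __contains__(self, item):
        return all(self.bits[p] for p in self._pos(item))

class ScalableBloomFilter:
    """Appends a larger, stricter filter whenever the current one fills up."""
    def __init__(self, initial_capacity=1000, target=0.01,
                 growth=2, tightening=0.5):
        self.growth, self.tightening = growth, tightening
        # First slice gets error target*(1-r); slice i gets that times r**i,
        # so the geometric sum of all slice errors stays below `target`.
        self.slice_error = target * (1 - tightening)
        self.filters = [BloomFilter(initial_capacity, self.slice_error)]

    def add(self, item):
        current = self.filters[-1]
        if current.count >= current.capacity:
            self.slice_error *= self.tightening
            current = BloomFilter(current.capacity * self.growth, self.slice_error)
            self.filters.append(current)
        current.add(item)

    def __contains__(self, item):
        return any(item in f for f in self.filters)
```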

7.
Time-series similarity search is one of the most fundamental operations in time-series data analysis and has a wide range of application scenarios. To address the problems that existing distributed algorithms cannot cope with growing dimensionality, scan too large a range, and spend too much time on similarity computation, a distributed time-series similarity search method for key-value stores, KV-Search, is proposed. First, the time-series data is partitioned into blocks, and key-value pairs are designed for them and stored in a key-value database, which handles the high and ever-growing dimensionality of time-series data; second, based on the Chebyshev distance, it computes the lower...

8.
Bloom filter (BF) is a space-efficient data structure that represents a large set of items and supports efficient membership queries. It has been widely proposed to employ Bloom filters in routing entries to facilitate data-centric routing in network applications. The existing designs of Bloom filters, however, cannot effectively support in-network queries. Given a query for a data item at a node in the network, the noise in unrelated routing entries is very likely to equal the useful information about the item in the right routing entries. Consequently, the majority of queries are routed towards many wrong nodes in addition to the intended destinations, wasting a large amount of network traffic. To address this issue, we classify the existing designs as CUBF (Cumulative Bloom Filters) and ABF (Aggregated Bloom Filters), and then evaluate their performance in routing queries under noisy environments. Based on the evaluation results, we propose a receiver-oriented design of Bloom filters that tightly restricts the probability of a wrong routing decision. Moreover, we significantly decrease the delay of a routing decision in the case of CUBF by using a bit-slice approach, and reduce the transmission size of each BF in the case of ABF by using compression. Both the theoretical analysis and experimental results demonstrate that our receiver-oriented design of Bloom filters clearly outperforms the existing approaches in terms of the success probability of routing and network traffic cost.

9.
Ternary content addressable memories (TCAMs) are widely used in network devices because they carry out the core operation of lookup in a single operation. TCAMs are a core component of many networking devices such as routers, switches, firewalls, and intrusion detection/prevention systems. Unfortunately, they are susceptible to errors caused by environmental factors such as radiation, and TCAM errors may have a significant impact on search results; in fact, a single error in a TCAM can cause 100% of search keys to return wrong lookup results. Therefore, TCAM error detection and correction schemes are needed to enhance the reliability of TCAM-based systems. All prior solutions require hardware changes to the TCAM circuitry and are therefore difficult to deploy. In this paper, we propose TCAMChecker, the first software-based solution for TCAM error detection and correction. Given a search key, TCAMChecker probabilistically decides whether to verify the lookup result; if it does, it performs two parallel lookups for the given search key. If the lookup results do not match, at least one error has been detected, and it is corrected using a backup error-free memory. The probability of lookup verification can be tuned to trade off performance against reliability: a higher verification probability provides a more reliable TCAM system at the cost of performance. TCAMChecker can be easily deployed on existing TCAM-based networking devices to improve system reliability.
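The probabilistic-verification idea can be sketched in a few lines (this abstracts away how the two parallel lookup paths are realized in hardware; all names here are placeholders, not the paper's API):

```python
import random

class TCAMChecker:
    """Sketch of probabilistic lookup verification: with probability
    `verify_prob` the key is looked up on two lookup paths and the results
    are compared; a mismatch reveals an error, which is corrected from an
    error-free backup copy of the rule set."""

    def __init__(self, lookup_a, lookup_b, backup_lookup, verify_prob=0.1):
        self.lookup_a = lookup_a            # primary TCAM lookup path
        self.lookup_b = lookup_b            # second lookup path used for checks
        self.backup_lookup = backup_lookup  # error-free memory (e.g., DRAM copy)
        self.verify_prob = verify_prob      # performance vs. reliability knob

    def lookup(self, key):
        result = self.lookup_a(key)
        if random.random() < self.verify_prob:
            if self.lookup_b(key) != result:
                # At least one lookup path returned a wrong result.
                result = self.backup_lookup(key)
        return result
```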

10.
The LSM-trie-based key-value (KV) store is often used to manage ultra-large datasets in practice; by introducing a number of sub-levels at each level, its linear growth pattern can substantially reduce the write amplification of store operations. Although this design is effective for writes, the last level holds a large proportion of the KV items, leading to an extreme imbalance in the data distribution. Therefore, to support efficient reads, this imbalance has to be considered carefully. On the other hand, to ensure that the data it returns is the latest version, the LSM-trie needs to search the levels one by one, and this search order can waste a lot of time. When the number of items is very large, random lookup performance may be poor because of the imbalanced data distribution. To address this issue, we improve the read performance of the LSM-trie by changing its serial search into a parallel search, using two threads to search the last level and the other levels simultaneously. Our experimental results show that the read performance of the LSM-trie can be improved by up to 98.35%, and by 71.55% on average.
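The serial-to-parallel change can be sketched with two concurrent searches, one over the last level and one over all upper levels; since upper levels hold newer versions, their hit takes precedence. The function names below are placeholders, not the paper's API:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_get(key, search_upper_levels, search_last_level):
    """Search the upper levels and the last level concurrently.
    Upper levels hold newer versions, so their hit wins."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        upper = pool.submit(search_upper_levels, key)
        last = pool.submit(search_last_level, key)
        value = upper.result()
        if value is not None:
            return value            # newer data shadows the last level
        return last.result()

# Example with stand-in search functions:
levels_upper = {"k1": "v1-new"}
level_last = {"k1": "v1-old", "k2": "v2"}
print(parallel_get("k2", levels_upper.get, level_last.get))   # -> "v2"
```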

11.
Distributed key-value stores replicate data into the local engines of multiple storage servers and guarantee the consistency of the replicas through a consensus protocol. The most common implementation uses the log-structured merge tree (LSM-tree) as its core data structure. However, the LSM-tree, designed for general-purpose workloads, is not well suited to the special workload pattern of the consistency logic; it degrades insert/update/delete performance and causes space amplification during full repair. To address these problems, this paper proposes a new local engine, PheonixLSM, which adds constraints on write and flush operations to eliminate the double-write problem in the insert/update/delete path of distributed key-value stores, thereby improving engine performance. By restructuring the layout of the SST files at the bottom of the LSM-tree, it supports reclaiming space immediately on deletion and eliminates the extra space amplification during full repair. Experimental results show that, compared with the native local engine, a distributed key-value store using PheonixLSM improves insert/update/delete performance by 90.7%, reduces the space amplification of full repair from 65.6% to 6.4%, and cuts repair time by 72.3%.

12.
Membership determination of text strings is an important procedure for analyzing tremendous amounts of textual data, especially when time is a crucial factor. The Bloom filter is a well-known approach to this problem because of its succinct structure and simple determination procedure. As membership determination with classification is becoming increasingly desirable, parallel Bloom filters are often implemented to meet the additional classification requirement. Parallel Bloom filters, however, tend to produce additional false-positive errors, since membership determination must be performed on each of the parallel layers. We propose a scheme based on CMAC, a neural network mapping, which requires only a single-layer calculation to obtain both membership and classification information simultaneously. A hash function specifically designed for text strings is also proposed. The proposed scheme can effectively reduce false-positive errors by shrinking the range of membership acceptance to the minimum for each class during the neural network mapping. Simulation results show that the proposed scheme commits significantly fewer errors than the benchmark, parallel Bloom filters, with identical and limited memory usage at different classification levels.

13.
One widely used mechanism for representing the membership of a set of items is the Bloom filter, a simple space-efficient randomized data structure. Yet, Bloom filters are not entirely suitable for many new network applications that need to represent and query items with multiple attributes rather than a single attribute. In this paper, we present an approach to the accurate and efficient representation and querying of multi-attribute items using Bloom filters. The approach proposes three variant structures: the Parallel Bloom Filter (PBF), PBF with a hash table (PBF-HT), and PBF with a Bloom filter (PBF-BF). PBF stores the multiple attributes of an item in parallel Bloom filters, and the auxiliary HT and BF capture the inherent dependency among all attributes of an item. Compared with standard Bloom filters for representing items with multiple attributes, the proposed PBF provides much faster query service, and both PBF-HT and PBF-BF achieve a much lower false positive probability while saving storage space. Simulation and experimental results demonstrate that the new space-efficient Bloom filter structures can efficiently and accurately represent multi-attribute items and respond to queries quickly, at the cost of a relatively small false positive probability.
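As a rough illustration of the parallel structure (all sizes and hashes are assumptions, and the extra filter over the concatenated attributes merely stands in for the auxiliary HT/BF that binds an item's attributes together):

```python
import hashlib

def _bloom_positions(item, m=4096, k=4):
    return [int(hashlib.sha1(f"{i}:{item}".encode()).hexdigest(), 16) % m
            for i in range(k)]

class ParallelBloomFilter:
    """Sketch in the spirit of PBF-BF: one Bloom filter per attribute, plus
    one extra filter over the concatenated attributes to bind them together."""

    def __init__(self, num_attributes, m=4096):
        self.m = m
        self.attr_filters = [[0] * m for _ in range(num_attributes)]
        self.join_filter = [0] * m    # records the item's attribute combination

    def insert(self, attributes):
        for flt, value in zip(self.attr_filters, attributes):
            for p in _bloom_positions(value, self.m):
                flt[p] = 1
        for p in _bloom_positions("|".join(attributes), self.m):
            self.join_filter[p] = 1

    def query(self, attributes):
        per_attr = all(
            all(flt[p] for p in _bloom_positions(value, self.m))
            for flt, value in zip(self.attr_filters, attributes))
        joined = all(self.join_filter[p]
                     for p in _bloom_positions("|".join(attributes), self.m))
        return per_attr and joined

pbf = ParallelBloomFilter(num_attributes=3)
pbf.insert(["fileA", "ownerX", "tagY"])
print(pbf.query(["fileA", "ownerX", "tagY"]))   # True
print(pbf.query(["fileA", "ownerZ", "tagY"]))   # very likely False
```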

14.
On the false-positive rate of Bloom filters
Bloom filters are a randomized data structure for membership queries dating back to 1970. Bloom filters sometimes give erroneous answers to queries, called false positives. Bloom analyzed the probability of such erroneous answers, called the false-positive rate, and Bloom's analysis has appeared in many publications throughout the years. We show that Bloom's analysis is incorrect and give a correct analysis.
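For context, the expression below is the classical estimate of the false-positive rate that the paper revisits, for a Bloom filter of m bits with k hash functions after n insertions; it implicitly treats the k bit positions probed by a query as independent, which is the step the paper's analysis scrutinizes (the corrected formula itself is not given in this abstract):

```latex
% Probability that a particular bit is still 0 after n insertions:
\Pr[\text{bit} = 0] \;=\; \left(1 - \tfrac{1}{m}\right)^{kn} \;\approx\; e^{-kn/m}
% Classical (textbook) false-positive estimate:
p \;\approx\; \left(1 - \left(1 - \tfrac{1}{m}\right)^{kn}\right)^{k} \;\approx\; \left(1 - e^{-kn/m}\right)^{k}
```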

15.
Finding similar items in a large and unstructured dataset is a challenging task in many applications of data science, such as searching, indexing, and retrieval. With increasing data volumes and the demand for real-time responses, similarity search has gained much attention. In this paper, a parallel computational approach for similarity search using Bloom filters (PCASSB) is proposed, which uses a Bloom filter to represent document features and compare them with the user's query. Query features are stored in an integer query array (IQA). PCASSB, an approximate similarity search technique, has been implemented on a graphics processing unit with Compute Unified Device Architecture (CUDA) as the programming platform. To compute the similarity score between the query and the reference dataset, the Dice coefficient is used as the baseline method. The accuracy of the results generated by PCASSB is compared with the baseline method and other state-of-the-art methods. The experimental results show that the proposed technique is effective in processing large numbers of text documents, as it takes less computational time.
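The Dice coefficient used here as the baseline similarity score is a standard set-overlap measure; for a document feature set A and a query feature set B it is:

```latex
\mathrm{Dice}(A, B) \;=\; \frac{2\,\lvert A \cap B \rvert}{\lvert A \rvert + \lvert B \rvert}
```

A value of 1 means the two feature sets are identical, and 0 means they share no features.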

16.
17.
Hadoop Database (HBase) supports only a primary index structure, i.e., data can be retrieved only by the row key or a range of row keys. To address this, a new variant of the Counting Bloom Filter is proposed to build a secondary index that supports retrieval by non-primary-key data. Existing Counting Bloom Filter (CBF) techniques are analyzed, and to address the high overflow probability of the CBF, a new Split Counting Bloom Filter (SCBF) is proposed; SCBF splits a standard CBF into multiple mutually independent regions, and these regions jointly store the fingerprint of each element. Experimental results show that, compared with the standard CBF, SCBF lowers the overflow probability and substantially improves filter performance, making it well suited for building HBase secondary indexes.
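A hedged sketch of the splitting idea (the region count, 4-bit counters, and hashing are assumptions, and the fingerprint handling of the real SCBF is not reproduced here): the counter array is partitioned into independent regions and each element touches exactly one counter per region, which spreads increments more evenly and lowers the chance that any single counter overflows.

```python
import hashlib

class SplitCountingBloomFilter:
    """Sketch: the counter array is split into independent regions; each
    element increments one counter in every region (4-bit counters)."""

    COUNTER_MAX = 15          # 4-bit counters, as in typical CBFs

    def __init__(self, total_counters=4096, regions=4):
        assert total_counters % regions == 0
        self.region_size = total_counters // regions
        self.regions = [[0] * self.region_size for _ in range(regions)]

    def _pos(self, item, region_idx):
        h = hashlib.sha1(f"{region_idx}:{item}".encode()).hexdigest()
        return int(h, 16) % self.region_size

    def insert(self, item):
        for i, region in enumerate(self.regions):
            p = self._pos(item, i)
            if region[p] < self.COUNTER_MAX:    # saturate instead of overflowing
                region[p] += 1

    def query(self, item):
        return all(region[self._pos(item, i)] > 0
                   for i, region in enumerate(self.regions))

    def delete(self, item):
        for i, region in enumerate(self.regions):
            p = self._pos(item, i)
            if region[p] > 0:
                region[p] -= 1
```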

18.
In next-generation networks, packet classification is important for fulfilling the requirements of multimedia services, including VoIP and VoD. Using pre-defined filters, incoming packets can be categorized to determine which forwarding class each packet belongs to. Packet classification is essentially a problem of multidimensional range matching. The tuple space search is a well-known solution based on multiple hash accesses, one for each filter length combination. The tuple-based algorithm, rectangle search, is highly scalable with respect to the number of filters; however, it suffers from a memory-explosion problem, and its lookup performance is not fast enough for high-speed packet classification. This work proposes an improved scheme that reduces the required storage and achieves OC-192 wire-speed forwarding. The scheme consists of two parts. The "Tuple Reduction Algorithm" drastically reduces the number of tuples by duplicating filters; dynamic programming is used to optimize the tuple reduction, and two heuristic approaches are introduced to simplify the optimization process. Furthermore, the "Look-ahead Caching" scheme is presented to improve lookup performance; the basic idea is to prevent unnecessary tuple probing by filtering out the "un-matched" cases for the incoming packet. Experimental results show that combining the tuple reduction algorithm with look-ahead caching increases the lookup speed by a factor of six while requiring only around one third of the storage. Additionally, an extension to more general filters with multiple fields is addressed.

19.
Bloom filters are extensively used in distributed applications, especially in distributed databases and distributed information systems, to reduce network requirements and to increase performance. In this work, we propose two novel Bloom filter features that are important for distributed databases and information systems. First, we present a new approach to encode a Bloom filter such that its length can be adapted to the cardinality of the set it represents, with negligible overhead with respect to computation and false positive probability. The proposed encoding allows for significant network savings in distributed databases, as it enables the participating nodes to optimize the length of each Bloom filter before sending it over the network, for example, when executing Bloom joins. Second, we show how to estimate the number of distinct elements in a Bloom filter, for situations where the represented set is not materialized. These situations frequently arise in distributed databases, where estimating the cardinality of the represented sets is necessary for constructing an efficient query plan. The estimation is highly accurate and comes with tight probabilistic bounds. For both features we provide a thorough probabilistic analysis and extensive experimental evaluation which confirm the effectiveness of our approaches.
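The abstract does not spell out the estimator, but a commonly used way to estimate the number of distinct elements represented by a Bloom filter from its own state is shown below; it inverts the expected fraction of set bits and conveys the kind of estimate the paper describes:

```latex
% m = filter length in bits, k = number of hash functions,
% X = number of bits currently set to 1.
\hat{n} \;=\; -\,\frac{m}{k}\,\ln\!\left(1 - \frac{X}{m}\right)
```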

20.
This paper proposes a tree-shaped Bloom filter, TBF, based on a multi-level structure. Multi-level structures have been a hot topic in recent research on Bloom filters and related data structures: they enable multi-level storage, which relieves the on-chip memory burden and also speeds up on-chip lookup. TBF is a more efficient algorithm that remedies the defects of the BloomingTree algorithm; it provides the same functionality as a CBF while requiring less space than a CBF. Experiments show that, compared with BloomingTree, TBF effectively resolves BloomingTree's errors in logical indexing and is also more time-efficient: with the same number of levels and the same false positive rate, the query time improves by 13.4% on average; with the same false positive rate and the same number of levels, the insertion time improves by 17.9% on average and the deletion time by 12% on average.
