首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 62 毫秒
1.
Range nearest-neighbor query   总被引:6,自引:0,他引:6  
A range nearest-neighbor (RNN) query retrieves the nearest neighbor (NN) for every point in a range. It is a natural generalization of point and continuous nearest-neighbor queries and has many applications. In this paper, we consider the ranges as (hyper)rectangles and propose efficient in-memory processing and secondary memory pruning techniques for RNN queries in both 2D and high-dimensional spaces. These techniques are generalized for kRNN queries, which return the k nearest neighbors for every point in the range. In addition, we devise an auxiliary solution-based index EXO-tree to speed up any type of NN query. EXO-tree is orthogonal to any existing NN processing algorithm and, thus, can be transparently integrated. An extensive empirical study was conducted to evaluate the CPU and I/O performance of these techniques, and the study showed that they are efficient and robust under various data sets, query ranges, numbers of nearest neighbors, dimensions, and cache sizes.  相似文献   

2.
本文针对大规模高维数据近邻检索中的瓶颈问题,提出基于向量量化的一种检索方法—簇内乘积量化树方法.该方法运用向量量化和乘积量化的多层树状结构高效表征大规模高维数据集,与现有方法相比降低了索引表空桶率;其次提出基于贪心队列的近邻簇筛选方法减小了计算复杂度,加快了近邻检索速度;最后提出面量化方法用于近似计算候选数据集向量与查询向量间的距离,与点量化和线量化方法相比量化误差更小,提高了近邻查询准确率.本文提出的簇内乘积量化树算法在算子Sift和Gist描述的大规模高维数据集上与乘积量化树技术相比,首次召回准确率提高了57.7%,索引表空桶率降低幅度在50%以上,与局部优化乘积量化技术相比,查全率高达97%,而查询时间却仅需原来的1/9.实验结果表明本文提出的基于簇内乘积量化的近邻方法提升了近邻检索性能,为大规模高维数据集近邻检索提供了理论支持.  相似文献   

3.
Rav-tree:一种有效支持反向近似近邻查询的索引结构   总被引:1,自引:0,他引:1  
空间数据库的索引结构是实现有效数据查询的前提和基础。空间数据反向近似近邻查询是空间查询的一个新方向,它避免了精确查询中过多的距离计算,从而能够在效率与准确性上取得平衡。提出的Rav-tree不同于基于启发式规则的索引结构,首先利用局部近似,然后根据Voronoi cell区域和估计圆的方法实现近似近邻查询,并利用过滤结果和分域查询得到初步的候选集,最终通过反向近似近邻查询(RANNQuery)算法得到RANN集,并完整地给出基于Rav-tree的ANN查询算法和RANN查询算法。实验结果表明,Rav-tree对RANN等查询具有较好的查询效率和查全率。  相似文献   

4.
The paper proposes a novel symmetrical encoding-based index structure, which is called EDD-tree (for encoding-based dual distance tree), to support fast k-nearest neighbor (k-NN) search in high-dimensional spaces. In the EDD-tree, all data points are first grouped into clusters by a k-means clustering algorithm. Then the uniform ID number of each data point is obtained by a dual-distance-driven encoding scheme, in which each cluster sphere is partitioned twice according to the dual distances of start- and centroid-distance. Finally, the uniform ID number and the centroid-distance of each data point are combined to get a uniform index key, the latter is then indexed through a partition-based B^+-tree. Thus, given a query point, its k-NN search in high-dimensional spaces can be transformed into search in a single dimensional space with the aid of the EDD-tree index. Extensive performance studies are conducted to evaluate the effectiveness and efficiency of our proposed scheme, and the results demonstrate that this method outperforms the state-of-the-art high-dimensional search techniques such as the X-tree, VA-file, iDistance and NB-tree, especially when the query radius is not very large.  相似文献   

5.
K近邻查询是空间数据库中的重要查询之一,k近邻查询在内容的相似性检索、模式识别、地理信息系统中有重要应用。针对现有k近邻查询都是基于点查询的情况,提出基于平面线段的k近邻查询,查找线段集中给定查询点的k个最近线段。给出基于Voronoi图的线段k近邻查询算法及给出相关定理和证明。该算法通过线段Voronoi图的邻接特性找到一个候选集,然后从中找到最终结果。通过随机数据的实验证明,所提算法明显优于线性扫描算法和基于R树的k近邻查询算法。  相似文献   

6.
王淼  郝忠孝 《计算机工程》2010,36(10):47-49
多数不确定性对象的反向近邻查询不能明确回答某个不确定性对象是否为查询对象的反向最近邻,针对该问题,提出概率反向最近邻查询的概念,设计不确定性对象的概率反向最近邻查询的索引结构,给出一种基于该结构的不确定性对象的反向最近邻查询算法。  相似文献   

7.
詹芹 《计算机工程》2010,36(10):50-51
针对如何有效地利用大量的原始数据分析现状来预测未来的问题,基于抗体选择策略提出一种克隆选择挖掘算法。通过评估抗体的支持度、可信度和亲和度,求得有效的关联规则。实验结果表明,该算法能较快地获得可理解的规则,并且具有较高的准确率。  相似文献   

8.
杨泽雪  郝忠孝 《计算机工程》2014,(1):272-274,279
为解决动态环境中移动点的连续反向最近邻查询问题,将连续反向最近邻查询分为单色和双色2种情况进行研究。利用移动点Voronoi图,分别给出单色连续反向最近邻查询算法、双色连续反向最近邻查询算法以及相关定理,对算法正确性和可终止性进行证明,分析算法时间复杂性。按照移动点Voronoi图的拓扑结构是否改变分为2种情况,分析每种情况下候选所在区域的变化,在变化区域内进行Voronoi图的重构,得到对应的解决方法。在多数情况下,该算法只需生成局部移动点的Voronoi图即可找到结果,减小了连续反向最近邻查询的代价。  相似文献   

9.
Reverse nearest neighbor (RNN) queries have a broad application base such as decision support, profile-based marketing, resource allocation, etc. Previous work on RNN search does not take obstacles into consideration. In the real world, however, there are many physical obstacles (e.g., buildings) and their presence may affect the visibility between objects. In this paper, we introduce a novel variant of RNN queries, namely, visible reverse nearest neighbor (VRNN) search, which considers the impact of obstacles on the visibility of objects. Given a data set P, an obstacle set O, and a query point q in a 2D space, a VRNN query retrieves the points in P that have q as their visible nearest neighbor. We propose an efficient algorithm for VRNN query processing, assuming that P and O are indexed by R-trees. Our techniques do not require any preprocessing and employ half-plane property and visibility check to prune the search space. In addition, we extend our solution to several variations of VRNN queries, including: 1) visible reverse k-nearest neighbor (VRkNN) search, which finds the points in P that have q as one of their k visible nearest neighbors; 2) delta-VRkNN search, which handles VRkNN retrieval with the maximum visible distance delta constraint; and 3) constrained VRkNN (CVRkNN) search, which tackles the VRkNN query with region constraint. Extensive experiments on both real and synthetic data sets have been conducted to demonstrate the efficiency and effectiveness of our proposed algorithms under various experimental settings.  相似文献   

10.
A recurring problem in 3D applications is nearest-neighbor lookups in 3D point clouds. In this work, a novel method for exact and approximate 3D nearest-neighbor lookups is proposed that allows lookup times that are, contrary to previous approaches, nearly independent of the distribution of data and query points, allowing to use the method in real-time scenarios. The lookup times of the proposed method outperform prior art sometimes by several orders of magnitude. This speedup is bought at the price of increased costs for creating the indexing structure, which, however, can typically be done in an offline phase. Additionally, an approximate variant of the method is proposed that significantly reduces the time required for data structure creation and further improves lookup times, outperforming all other methods and yielding almost constant lookup times. The method is based on a recursive spatial subdivision using an octree that uses the underlying Voronoi tessellation as splitting criteria, thus avoiding potentially expensive backtracking. The resulting octree is represented implicitly using a hash table, which allows finding the leaf node a query point belongs to with a runtime that is logarithmic in the tree depth. The method is also trivially extendable to 2D nearest neighbor lookups.  相似文献   

11.
张丽平  李松  麻琳  唐远新  郝晓红 《计算机应用》2014,34(12):3470-3474
针对构建Voronoi图的方法的生成效率较低,构建复杂度较高的问题,提出了利用多方法交叉融合进行Voronoi图的构建与更新的方法。为了提高空间数据最近邻查询的效率,提出了基于Voronoi图和Voronoi多边形最小内切圆的最近邻查询方法;针对查询点位置频繁变化的情况,提出了基于Voronoi图和Voronoi多边形最小外接矩形的最近邻查询方法;为了提高对偶近邻对和最近对的查询效率,利用Voronoi多边形和对应的最小内切圆进行过滤和查询,提出了统一查询对偶近邻对和最近对的新方法。实验结果表明,所提方法解决了因数据分布不均导致的额外计算量的开销问题,在数据集规模较大和查询频率较高时具有一定的优势。  相似文献   

12.
A reverse k-nearest neighbor (RkNN) query retrieves the data points which regard the query point as one of their respective k nearest neighbors. A bi-chromatic reverse k-nearest neighbor (BRkNN) query is a variant of the RkNN query, considering two types of data. Given two types of data G and C, a BRkNN query regarding a data point q in G retrieves the data points from C that regard q as one of their respective k-nearest neighbors among the data points in G. Many existing approaches answer either the RkNN query or the BRkNN query. Different from these approaches, in this paper, we make the first attempt to propose a top-n query based on the concept of BRkNN queries, which ranks the data points in G and retrieves the top-n points according to the cardinalities of the corresponding BRkNN answer sets. For efficiently answering this top-n query, we construct the Voronoi Diagram of G to index the data points in G and C. From the information associated with the Voronoi Diagram of G, the upper bound of the cardinality of the BRkNN answer sets for each data point in G can be quickly computed. Moreover, based on an existing approach to answering the RkNN query and the characteristics of the Voronoi Diagram of G, we propose a method to find the candidate region regarding a BRkNN query, which tightens the corresponding search space. Finally, based on the triangle inequality, we propose an efficient refinement algorithm for finding the exact BRkNN answers from the candidate regions. To evaluate our approach on answering the top-n query, it is compared with an approach which applies a state-of-the-art algorithm for answering the BRkNN query to each data point in G. The experiment results reveal that our approach has a much better performance.  相似文献   

13.
张丽平  经海东  李松  崔环宇 《计算机科学》2016,43(5):174-178, 187
为了提升障碍空间中k最近邻查询的效率,研究了障碍空间中基于Voronoi图的k最近邻查询方法,提出了在障碍空间基于Voronoi图的kNN-Obs算法。该算法采用了两个过程:过滤过程和精炼过程。过滤过程主要是利用Voronoi图的过滤功能,较大程度地减少了被查询点的个数。精炼过程主要根据障碍距离和邻接生成点对候选集内对象进行第二次筛选。进一步给出了处理新增加点的ADDkNN-Obs算法和处理删除点的DENkNN-Obs算法。实验表明该算法在处理障碍空间中的k最近邻问题时具有优势。  相似文献   

14.
最近邻查询是地理信息系统领域经常遇到的问题,而反最近邻查询是在最近邻查询的基础上提出的一种新的查询类型。在分析利用Voronoi图进行最近邻查询的基础上,提出了基于Voronoi图及其对偶图Delaunay图的反最近邻查询,大大缩小了在海量空间数据库中进行反最近邻查询的查询范围。  相似文献   

15.
针对线性降维技术应用于具有非线性结构的数据时无法得到令人满意的结果的问题,提出一种新的着重于保持高维空间局部最近邻信息的非线性随机降维算法(NNSE)。该算法首先在高维空间中通过计算样本点之间的欧氏距离找出每个样本点的最近邻点,接着在低维空间中产生一个随机的初始分布;然后通过将低维空间中的样本点不断向其最近邻点的平均位置移动,直到产生稳定的低维嵌入结果。与一种先进的非线性随机降维算法——t分布随机邻域嵌入(t-SNE)相比,NNSE算法得到的低维结果在可视化方面与t-SNE算法相差不大,但通过比较两者的量化指标可以发现,NNSE算法在保持最近邻信息方面上明显优于t-SNE算法。  相似文献   

16.
由于已有的最近邻查询方法无法直接处理受限区域内的单纯型连续近邻链查询问题,针对受限区域和障碍物的复杂性,详细研究了受限区域内无障碍物和有障碍物环境下的单纯型连续近邻链查询方法,分别提出了VOR_NB_CRSCNNC算法和VOR_CB_CRSCNNC算法。算法基于计算几何中的Voronoi图和判定圆域对空间数据对象进行预先筛选和计算,每次查询仅需考虑落在数量较少的Voronoi多边形和判定圆域内的数据点,预先过滤掉大量数据,减少每次计算涉及的数据量。理论研究和实验分析表明,所提出的算法在查询过程中减少了数据逐一判断的冗余计算,受受限区域形状的影响较小,较大程度提高了查询效率。  相似文献   

17.
Traditional nearest-neighbor (NN) search is based on two basic indexing approaches: object-based indexing and solution-based indexing. The former is constructed based on the locations of data objects: using some distance heuristics on object locations. The latter is built on a precomputed solution space. Thus, NN queries can be reduced to and processed as simple point queries in this solution space. Both approaches exhibit some disadvantages, especially when employed for wireless data broadcast in mobile computing environments. In this paper, we introduce a new index method, called the grid-partition index, to support NN search in both on-demand access and periodic broadcast modes of mobile computing. The grid-partition index is constructed based on the Voronoi diagram, i.e., the solution space of NN queries. However, it has two distinctive characteristics. First, it divides the solution space into grid cells such that a query point can be efficiently mapped into a grid cell around which the nearest object is located. This significantly reduces the search space. Second, the grid-partition index stores the objects that are potential NNs of any query falling within the cell. The storage of objects, instead of the Voronoi cells, makes the grid-partition index a hybrid of the solution-based and object-based approaches. As a result, it achieves a much more compact representation than the pure solution-based approach and avoids backtracked traversals required in the typical object-based approach, thus realizing the advantages of both approaches. We develop an incremental construction algorithm to address the issue of object update. In addition, we present a cost model to approximate the search cost of different grid partitioning schemes. The performances of the grid-partition index and existing indexes are evaluated using both synthetic and real data. The results show that, overall, the grid-partition index significantly outperforms object-based indexes and solution-based indexes. Furthermore, we extend the grid-partition index to support continuous-nearest-neighbor search. Both algorithms and experimental results are presented. Edited by R. Guting  相似文献   

18.
We consider the problem of finding similar patterns in a time sequence. Typical applications of this problem involve large databases consisting of long time sequences of different lengths. Current time sequence search techniques work well for queries of a prespecified length, but not for arbitrary length queries. We propose a novel indexing technique that works well for arbitrary length queries. The proposed technique stores index structures at different resolutions for a given data set. We prove that this index structure is superior to existing index structures that use a single resolution. We propose a range query and nearest neighbor query technique on this index structure and prove the optimality of our index structure for these search techniques. The experimental results show that our method is 4 to 20 times faster than the current techniques, including sequential scan, for range queries and 3 times faster than sequential scan and other techniques for nearest neighbor queries. Because of the need to store information at multiple resolution levels, the storage requirement of our method could potentially be large. In the second part, we show how the index information can be compressed with minimal information loss. According to our experimental results, even after compressing the size of the index to one fifth, the total cost of our method is 3 to 15 times less than the current techniques.  相似文献   

19.
人们设计了许多索引以有效地处理高维空间中的近邻查询和区域查询。已经证明,维数较高时利用高维索引处理这两类查询几乎不可能比线性扫描快。提出了一种两层索引以自适应地识别数据集中的聚簇;数据集具有聚簇特性时,用该索引处理邻近查询和区域查询比现有的索引结构快;对其他数据集,利用该索引处理邻近查询和区域查询与线性扫描大致相当。该索引的上层结构将一些参考点组织成一棵二叉树,下层结构是一系列动态哈希表。数据集中的数据点根据它们到参考点的相对距离被哈希到相应的哈希桶中。查询处理时用查询点到参考点的距离进行剪除搜索。实验表明,提出的索引结构具有良好的性能。  相似文献   

20.
Reverse nearest-neighbor (RNN) query processing is important for many applications such as decision-support systems, profile-based marketing and molecular biology; consequently, RNN query processing has attracted considerable attention in the research community in recent years. Most existing approaches for RNN query processing either rely on nearest-neighbor pre-computation or work for specific data space (e.g., the Euclidean space). The only method for RNN query processing in metric space is based on the M-tree. In this paper, we propose an approach for RNN query processing in high-dimensional metric space using distance-based index structure (in particular, NAQ-tree that outperforms the other distance-based index structures as we have already verified in a previous study). In high-dimensional space, the properties of distance-based index structure provide strong pruning rules than the M-tree. In addition, unlike the previous work, our approach integrates the filtering and verification steps and uses the information obtained in the verification stage to further improve the filtering rate. Our approach delivers results incrementally and hence well serves real-time applications. The reported experimental results demonstrate the applicability and effectiveness of the proposed NAQ-tree-based RNN approach.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号