共查询到19条相似文献,搜索用时 265 毫秒
1.
提出一种在椭圆体聚类上进行主分量排序的高维索引方法, 线性访问较少的数据点就可完成k近邻搜索过程。该方法对数据集进行椭圆体聚类划分,在KL变换域上建立近似向量。在k近邻搜索过程中,采用部分失真搜索算法,按照距离下界由小到大的顺序依次搜索各个椭圆体聚类。在大型高维图像特征库上的实验表明,与其他向量近似方法相比,该索引结构降低近似向量的访问数量,能够较显著提高k近邻搜索速度。 相似文献
2.
3.
在数据量与数据复杂度不断增加的时代,大数据处理与分析成为当前的热门研究内容,高维空间数据的使用越来越频繁,数据检索和访问速度成了衡量数据处理系统性能的重要指标.因此,如何设计实现一种高效的高维索引结构,提高查询访问速率、降低内存占用,变得至关重要.近年, Kraska等人提出了学习型索引的方法.实验证明该方法在真实数据集上表现良好.之后机器学习与深度学习在数据库系统中的运用越来越广泛.众多研究者尝试在高维数据上构建学习型索引,来提升高维数据的查询速度.但是目前的高维学习型索引采用的方法并不能将数据分布的信息有效利用起来,而且过于复杂的深度学习模型使得索引初始化开销过大.结合空间区域划分与降维两种技术,提出一种新颖的高维学习型索引.它能更有效地利用数据分布信息提高索引的查询效率,并利用多段线性模型在保证查找精确度的前提下尽可能减少索引初始化的开销.分别在随机生成的数据集和开源街区地图数据集上进行实验验证.结果表明,与现有的高维索引相比,其在索引构建、查询效率、以及内存占用方面都有显著提高. 相似文献
4.
文中提出一种支持概率k近邻查询的不确定高维索引结构--ISU-Tree.在高维空间,首先对n个不确定数据对象进行k平均聚类,然后分别对每个不确定超球进行初始"切片",并对其进行多特征编码得到对应的统一化索引键值,并且用B+树建立索引.这样,高维空间的概率查询就转变成对一维空间的启发式的范围查询及求精运算.理论及实验分析表明ISU-Tree索引能更有效地缩小搜索空间,减少积分计算的代价.在查询效率方面要明显优于其它的索引方法,尤其适合海量高维不确定数据的概率查询. 相似文献
5.
提出一种在网格环境下的k近邻查询方法——GkNN.到目前为止,尚未有文献提出数据网格环境下的k近邻查询算法.当用户在查询节点提交一个查询向量和k,首先以一个较小的查询半径。在数据节点进行基于双重距离尺度的向量缩减,然后将缩减后的向量按照向量“打包”传输的方式发送到执行节点,在执行节点并行地对这些候选向量进行距离(求精)运算.最终将结果向量返回到查询节点.当返回的向量个数小于k时,扩大半径值,继续循环直到得到k个最近邻向量为止.理论分析和实验证明该方法在减少网络通信开销、增加I/O和CPU并行、降低-向应时间方面具有较好的性能,非常适合海量高维数据的查询. 相似文献
6.
高维索引作为基于内容检索和模式识别等领域的一项关键技术,其性能直接影响整个系统的查询速度和准确率,但高维情况下的 “维度灾难”一直制约着相应检索性能的提高。通过分析小世界模型,提出了完整的逐跳逼近索引算法,该算法仅维护点与点在度量空间上的局部邻近关系,通过将查询过程的“关注点”逐步往查询命中区域跳跃逼近来实现高维空间数据点间的范围查询和近似近邻查询。实验证明该方法在不依赖索引数据的先验分布情况下能有效地处理高维数据向量的检索,且具有良好的可维护性与拓展性。 相似文献
7.
8.
遥感高光谱数据是一种具有空间聚集特性的高维数据。对PT方法进行改进使之与iDistance的索引机制相适应,并融合这两种不同的空间划分策略,提出一种适用于高光谱数据的索引结构。该索引是一种度量空间的高维索引,采用两级空间划分,在处理光谱相似性查询时可同时完成针对距离和空间方位的数据过滤。实验证明该索引可以有效降低I/O和距离计算次数,具有较高的剪枝效率,适用于高光谱数据相似性查询。 相似文献
9.
10.
真实世界中,常存在很多障碍物,影响空间对象到查询点的可见性及距离,可见k近邻查询查找距查询点最近的k个可见对象,是时空查询领域的一类重要算法.由于度量设备误差以及通信开销的限制等因素,空间对象位置不确定因素广泛存在.文中拟对不确定对象执行可见k近邻查询,提出了概率可见k近邻(PVkNN)查询,即查找前k个成为查询点最近邻居概率最大的节点.为了高效地执行这一查询,文中提出了k-界限剪枝方法,基于可见质心的紧缩过滤以及对不可见对象的剪枝策略,从空间角度过滤掉不符合条件的对象.为避免对候选集合中每个对象的概率都进行精确计算,从概率角度提出了根据概率上下限来对候选集合进行进一步的求精方法,采用近似采样技术来获取可见区域的比例,实现了对PVkNN的高效计算.采用真实和模拟数据集设计实验,充分验证了算法的效率和精度. 相似文献
11.
The vector approximation file (VA-file) approach is an efficient high-dimensional indexing method for image retrieval in large database. Some extensions of VA-file have been proposed towards better query performance. However, all of these methods applying sequential scan need read the whole vector approximation file. In this paper, we present a new indexing structure based on vector approximation method, in which only a small part of approximation file need be accessed. First, principal component analysis is used to map multidimensional points to a 1D line. Then a B+-tree is built to index the approximate vector according to principal component. When performing k-nearest neighbor search, the partial distortion searching algorithm is used to reject the improper approximate vectors. Only a small set of approximate vectors need to be sequentially scanned during the search, which can reduce the CPU cost and I/O cost dramatically. Experiment results on large image databases show that the new approach provides a faster search speed than the other VA-file approaches. 相似文献
12.
Ben Shneiderman 《International journal of parallel programming》1974,3(1):91-101
Multilevel indexes have long been used for accessing records in sorted files. Given the access cost at each level, the total cost of retrieving a record from the file can be substantially reduced by selecting the proper size of the index at each level. Organizations involving a variable number of levels are covered and binary searching is compared to sequential searching. 相似文献
13.
An evaluation of vertical class partitioning for query processing in object-oriented databases 总被引:1,自引:0,他引:1
Chi-wai Fung Karlapalem K. Qing Li 《Knowledge and Data Engineering, IEEE Transactions on》2002,14(5):1095-1118
Vertical partitioning is a design technique for reducing the number of disk accesses to execute a given set of queries by minimizing the number of irrelevant instance variables accessed. This is accomplished by grouping the frequently accessed instance variables as vertical class fragments. The complexity of object-oriented database models due to subclass hierarchy and class composition hierarchy complicates the definition and representation of vertical partitioning of the classes, which makes the problem of vertical partitioning in OODBs very challenging. In this paper, we develop a comprehensive analytical cost model for processing of queries on vertically partitioned OODB classes. A set of analytical evaluation results is presented to show the effect of vertical partitioning, and to study the trade-off between the projection ratio versus selectivity factor vis-a-vis sequential versus index access. Furthermore, an empirical experimental prototype supporting vertical class partitioning has been implemented on a commercial OODB tool kit to validate our analytical cost model. 相似文献
14.
15.
In order to improve the performance and efficiency of truss structure optimization, this paper presents a general framework that embeds and seamlessly integrates commercial CAD and CAE software through common programming languages and application programming interface (API). Along with the automatic CAD/CAE integration, an adaptive metamodel-based optimization called sequential radial basis function (SRBF) is applied to truss structure optimization involving sizing, geometry and topology variables. SRBF distinguishingly features two-loops searching strategy, the “inner loop” and the “outer loop”. The “inner loop” aims to search a feasible point through updating the factors of the augmented Lagrangian function. With the improved significant sampling space (ISSS) method, the “outer loop” sequentially generates new additional samples to update the RBF model. The continuous relaxation method is developed to deal with the mixed-discrete variables during the truss structure optimization. Applied to practical truss structure optimization problems from small scale to large scale, the proposed framework demonstrates feasibility of the CAD/CAE integration system during the structure modeling and analysis, and facilitates the truss structure optimization process. The comparison results between the SRBF and other approaches show that SRBF improves merit of searching global optimum and reduces the computation cost. 相似文献
16.
本文给出一种以词语为索引项的索引文件存储结构,以及基于这种结构的索引查询算法.首先分析中文索引库的分布规律,接着在此基础上设计了一种逆序存储的三层索引结构,这种结构在创建索引时能根据词语频率自动调整存储顺序,最后给出一种基于自动机和逆向最大匹配的索引查询算法.实验系统TIFS将三层索引结构与B树、哈希方法在时间和空间复杂度方面进行对比,结果表明,对于大规模的中文文本检索,三层索引结构的综合效果最好. 相似文献
17.
传统的多模式匹配算法是用树型结构的有限自动机实现的 ,它具有很多缺点 .本文提出的多模式匹配算法是基于有序二叉树的多模式匹配算法 .实验证明 ,本文算法不但具有和传统算法相当的查找速度 ,而且构造速度快、内存耗费少 .因此 ,本文提出的算法特别适用于要求动态构造自动机的情况 相似文献
18.
在大规模多媒体数据库中进行基于内容的检索,高维数据牵引结构的研究是重要问题,提出了一种有效的高维索引结构-自适应近似树,阐述了它的结构,给出了构建和检索算法,它结合了树结构和顺序检索的共同优点,针对不同的数据分布情况可以自适应地调整结构,维数较低或数据分布偏斜较大时它呈现树的结构,高维或数据分布密集时呈现顺序扫描的结构,以达到更优的检索效率,在结构上,对MBR使用了压缩存储的方法以节省存储空间,在算法中充分利用了空间划分是MBS和MBR共存的特点,减少了大量复杂的计算,从而大大提高检索效率。 相似文献
19.
This article deals with the group interview problem, in which each group contains several alternatives and each group of alternatives is presented and evaluated sequentially over time. We derive the optimal selection strategy for the group interview problem with a general utility function. Among the various types of utility function, we focus on the best choice problem, in which our utility is one if we successfully select the best choice and zero otherwise. We derive a simple selection rule called the optimal partitioning strategy in which the decision-maker divides the entire groups into two disjoint sets and, after evaluating the choices in the first set, chooses the relatively best choice available for the first time in the second set. Because the selected choice is not necessarily the absolutely best choice, we also consider the probability distribution of the actual rank of the choice selected under the partitioning strategy.