Similar literature
20 similar documents found.
1.
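He Jing, Wu Yue, Yang Fan, Yin Chunlei, Zhou Wei. Journal of Computer Applications (《计算机应用》), 2014, 34(11): 3218-3221
Most cloud storage systems store data in a key-value model, so a multi-dimensional query has to scan the entire data set and is inefficient. To address this, the paper proposes a multi-dimensional index structure based on the KD-tree and the R-tree (the KD-R index). The KD-R index uses a two-layer scheme: a KD-tree-based multi-dimensional global index is built on the global server, and an R-tree multi-dimensional local index is built on each local data node. Based on a performance-cost model, the R-tree nodes with lower indexing cost are selected and published to the global KD-tree, which optimizes multi-dimensional query performance. Experimental results show that, compared with a global distributed R-tree index, the KD-R index effectively improves multi-dimensional range query performance and remains highly available when server nodes fail.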
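
The two-layer lookup described in this abstract can be illustrated with a minimal sketch. It is not the KD-R index itself: a flat dictionary of bounding boxes stands in for the global KD-tree, a plain list scan stands in for each local R-tree, and every node publishes a single bounding box instead of choosing R-tree nodes by a cost model. The names `TwoLayerIndex`, `node_a`, and `node_b` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]
Box = Tuple[Point, Point]  # (lower corner, upper corner)


def intersects(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes overlap in every dimension."""
    return all(al <= bh and bl <= ah
               for al, ah, bl, bh in zip(a[0], a[1], b[0], b[1]))


def contains(box: Box, p: Point) -> bool:
    """True if point p lies inside the box (borders included)."""
    return all(lo <= x <= hi for lo, x, hi in zip(box[0], p, box[1]))


def bounding_box(points: List[Point]) -> Box:
    """Smallest axis-aligned box that encloses all points."""
    dims = range(len(points[0]))
    return (tuple(min(p[d] for p in points) for d in dims),
            tuple(max(p[d] for p in points) for d in dims))


class TwoLayerIndex:
    def __init__(self, local_data: Dict[str, List[Point]]):
        self.local_data = local_data
        # Global layer (assumption: one published box per data node).
        self.global_index = {node: bounding_box(pts)
                             for node, pts in local_data.items()}

    def range_query(self, query: Box):
        # Step 1: the global layer prunes nodes whose box misses the query.
        candidates = [n for n, box in self.global_index.items()
                      if intersects(box, query)]
        # Step 2: only the surviving nodes search their local data.
        hits = [p for n in candidates
                for p in self.local_data[n] if contains(query, p)]
        return sorted(candidates), sorted(hits)
```

In this sketch a query that overlaps only one node's published box never touches the other node, which is the pruning effect a global index is meant to provide.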

2.
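Wang Meng, Zhang Ming. Modern Computer (《现代计算机》), 2010, (5): 78-80, 93
Content-based image retrieval is attracting growing attention, and multi-dimensional indexing is one of its key techniques. The paper gives a comprehensive analysis of multi-dimensional indexing, covering the state of the art, the characteristics of multi-dimensional data and of multi-dimensional index structures, and the query types these structures support, with an emphasis on six representative index structures.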

3.
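Huang Weihui, Xiong Ao. Software (《软件》), 2013, (11): 77-79
Processing multi-dimensional data has become a key factor in the development of many fields, and similarity queries over multi-dimensional data in particular are widely used. When the dimensionality is high, the performance of most index structures degrades, a phenomenon known as the "curse of dimensionality". To address it, the paper proposes RAKDB-Tree, an efficient index structure for multi-dimensional data. The structure first partitions the data space into subspaces and then indexes the subspaces with an improved KDB-Tree. The query, insertion, and deletion algorithms of RAKDB-Tree keep the index structure in a near-optimal state. Experimental results show that RAKDB-Tree copes well with the problems caused by increasing data dimensionality.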

4.
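OLAP systems built on data warehouses are currently the main tool for analyzing massive multi-dimensional data. As information technology develops, the volume of such data grows sharply and its structure becomes more complex, so OLAP performance degrades severely and can no longer meet analysis needs. Based on the distributed computing system Hadoop, the paper presents new storage and query methods for massive multi-dimensional data. It designs HCFile, a column-store file format on HDFS, and gives an HCFile-based storage scheme that improves aggregation efficiency and scales well. It also exploits the hierarchical semantics of multi-dimensional data to design a dimension-hierarchy index, and gives a method for aggregation that combines this index with MapReduce. Comparative experiments against Hive show that the storage scheme and query method effectively improve the performance of massive multi-dimensional data analysis.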

5.
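Multi-dimensional indexing is a key technology for content-based image databases. SR-tree and X-tree are currently among the more mature and effective multi-dimensional indexing techniques. To improve indexing performance, the authors analyze the structure and performance of SR-tree and X-tree and, targeting the weaknesses of the SR-tree split algorithm, introduce the supernode idea from X-tree. By improving the insertion and split algorithms, they design a new multi-dimensional index structure, ESR-tree (Extended SR-tree). Experiments show that as data volume and dimensionality grow, ESR-tree clearly outperforms both SR-tree and X-tree.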

6.
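A Multi-Dimensional Data Indexing Mechanism Supporting Complex Queries in Cloud Computing Environments   Total citations: 1, self-citations: 0, citations by others: 1
Data indexes in distributed storage systems for cloud computing do not support complex queries. To address this, the paper proposes M-Index, a multi-dimensional data indexing mechanism. It uses the pyramid technique to map the multi-dimensional metadata of the data to a one-dimensional index, and on that basis introduces, for the first time, the concept of the prefix binary tree (PBT). The prefix extracted from the one-dimensional index and the valid PBT nodes serves as the primary key of the data in the storage system. Data are published, according to the primary key and a consistent hashing mechanism, to an overlay network formed by the storage nodes. An M-Index-based query algorithm converts complex query requests into one-dimensional query keys, effectively supporting complex query modes such as multi-dimensional queries and range queries. Theoretical analysis and experiments show that M-Index achieves good query efficiency and load balance under complex query modes.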
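
The mapping from a multi-dimensional point to a one-dimensional key can be sketched as follows. This follows the pyramid technique as it is commonly described for points normalized to the unit hypercube; it is an illustration only and does not reproduce the prefix binary tree or the primary-key construction of M-Index. The function name `pyramid_value` is hypothetical.

```python
from typing import Sequence


def pyramid_value(point: Sequence[float]) -> float:
    """Map a point in [0, 1]^d to a single number (pyramid number + height).

    The unit hypercube is split into 2*d pyramids that meet at the
    center (0.5, ..., 0.5). The integer part of the result identifies
    the pyramid; the fractional part is the distance from the center
    along that pyramid's axis.
    """
    d = len(point)
    # Dimension in which the point deviates most from the center.
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))
    height = abs(0.5 - point[j_max])
    # Pyramids 0..d-1 lie on the "low" side, d..2d-1 on the "high" side.
    pyramid = j_max if point[j_max] < 0.5 else j_max + d
    return pyramid + height
```

Because the height never exceeds 0.5, the values of different pyramids do not overlap, so the resulting keys can be stored in any ordered one-dimensional index.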

7.
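As a new strategic resource, big data plays an important role in the information field. Retrieval over big data often involves billions or even tens of billions of records, so the inefficiency of traditional query mechanisms has become the norm. Improving query efficiency and reducing query load is therefore an important aspect of big data research. The paper proposes IMFM, a retrieval filtering model for batch-oriented big data processing, describes its core idea and working principle, demonstrates its support for multi-dimensional queries, and gives a deployment strategy. The model is deployed at suitable positions in the big data index structure and quickly filters retrieval requests as they pass through a node, which keeps irrelevant requests from operating on the index structure below that node and so reduces the performance cost of retrieval. Experiments show that, in a batch-processing environment, the model effectively shortens the path length of one-dimensional and multi-dimensional queries, improves retrieval efficiency, and greatly reduces the load on big data storage and processing platforms.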

8.
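To address multi-dimensional range queries in large-scale cloud peer-to-peer networks, the paper introduces an index architecture based on a balanced m-ary tree into the cloud P2P environment. On this architecture, hierarchical tree structures that support multi-dimensional indexing in centralized environments, such as the R-tree and the QR-tree, can be implemented. The multi-dimensional range query algorithm allows a query to start from any position in the tree, which avoids the performance bottleneck caused by the root node. Calculation and experiments verify that, for a network of N nodes, the cost of a multi-dimensional range query is O(log_m N) (m > 2, where m is the fan-out). Query efficiency is therefore independent of the dimensionality d and does not degrade as d increases. Finally, a cost model based on the fan-out m is built and the optimal value of m is computed.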
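
The O(log_m N) bound can be made concrete with a small calculation. The sketch below assumes a full balanced m-ary tree in which the number of nodes reachable within h levels is m^h; that model and the name `tree_height` are assumptions for illustration, not the cost model of the paper.

```python
def tree_height(n_nodes: int, fanout: int) -> int:
    """Smallest h such that fanout**h >= n_nodes (integer arithmetic only)."""
    if n_nodes < 1 or fanout < 2:
        raise ValueError("need n_nodes >= 1 and fanout >= 2")
    height, capacity = 0, 1
    while capacity < n_nodes:
        capacity *= fanout
        height += 1
    return height
```

Under this assumption, a network of one million nodes needs about 10 levels at fan-out 4 and 5 levels at fan-out 16. The dimensionality d does not appear in the calculation, which matches the claim that query cost does not grow with d.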

9.
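A New P2P-Based Method for Grid Resource Information Discovery   Total citations: 1, self-citations: 0, citations by others: 1
In a grid environment, finding the required resources among a large number of resources is a key problem. Based on a structured P2P system that supports ordered data indexing, the paper proposes a new method for resource discovery in grids, which introduces Pyramid, an advanced multi-dimensional indexing technique from the database field, into the P2P system. With this indexing technique, the P2P system supports multi-dimensional range queries over grid resources. The algorithm uses a symmetric pyramid technique, which performs well in terms of the maintenance cost caused by changes in the dynamic attributes of grid resources. It is proved theoretically that, when the dimensionality is high, the maintenance cost caused by attribute dynamics is inversely proportional to the dimensionality and independent of the range over which the attributes vary. Load balancing strategies for the P2P system are also considered. Finally, the routing performance of the system and the effectiveness of range queries are verified by simulation.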

10.
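An Index Framework for Peer-to-Peer Computing Supporting Multi-Dimensional Range Queries   Total citations: 1, self-citations: 0, citations by others: 1
Supporting multi-dimensional range queries efficiently is a long-standing research topic in traditional data management, but it remains challenging in large-scale distributed systems. VBI-tree is an index architecture based on a balanced tree for peer-to-peer environments. On this architecture, various hierarchical tree structures that support multi-dimensional indexing in centralized environments, such as R-tree, X-tree, and M-tree, can be implemented. The query algorithm of VBI-tree allows a query to start from any position in the tree instead of from the root, as hierarchical trees in centralized environments do, and so avoids the performance bottleneck caused by the root node. For a network of N nodes, the index guarantees a query cost of O(log N). VBI-tree also proposes a load balancing strategy based on network restructuring with AVL-tree rotations, which balances load effectively. In addition, to improve index performance when data operations are frequent, special ancestor-descendant links are built on the VBI-tree. With these links, the exploration of relevant query regions happens as far as possible among nodes at the same level instead of being forwarded toward the root, which relieves the query load on upper-level nodes and significantly lowers the update cost. Simulation experiments verify the effectiveness of the proposed method.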

11.
Digitization has created an abundance of new information sources by altering how pictures are captured. Accessing large image databases from a web portal requires a suitable indexing structure instead of reducing the contents of different kinds of databases for quick processing. This approach paves a path toward more efficient image retrieval techniques and much research in image indexing involving large image datasets. Image retrieval usually encounters difficulties such as a) merging the diverse representations of images and their indexing, b) the low-level visual characteristics and semantic characteristics associated with an image being indirectly proportional, and c) noisy and less accurate extraction of image information (semantic and predicted attributes). This work builds on reverse-engineering and denormalization concepts to evaluate how data can be stored effectively, so that retrieval becomes straightforward and rapid. This research also deals with deep root indexing with a multi-dimensional approach to how images can be indexed, and provides improved results in terms of query processing performance and reduced maintenance and storage cost. We focus on schema design for a non-clustered index solution, especially for covering queries. This schema provides a filter predicate to build an index over a particular subset of rows, with an index table called filtered indexing. Finally, we include non-key columns in addition to the key columns. Experiments on two image data sets, with and without filtered indexing, show low query cost. We compare efficiency in terms of mean average precision to measure the accuracy of retrieval with the developed coherent semantic indexing. The results show that retrieval using deep root indexing is simple and fast.

12.
To achieve higher performance and energy efficiency, GPGPU architectures have recently begun to employ hardware caches. Adding caches to GPGPUs, however, does not always guarantee improved performance and energy efficiency due to the thrashing in small caches shared by thousands of threads. While prior work has proposed warp-scheduling and cache-bypassing techniques to address this issue, relatively little work has been done in the context of advanced cache indexing (ACI). To bridge this gap, this work investigates the effectiveness of ACI for high-performance and energy-efficient GPGPU computing. We discuss the design and implementation of static and adaptive cache indexing schemes for GPGPUs. We then quantify the effectiveness of the ACI schemes based on a cycle-accurate GPGPU simulator. Our quantitative evaluation demonstrates that the ACI schemes are effective in that they provide significant performance and energy-efficiency gains over the conventional indexing scheme. Further, we investigate the performance sensitivity of ACI to key architectural parameters (e.g., indexing latency and cache associativity). Our experimental results show that the ACI schemes are promising in that they continue to provide significant performance gains even when additional indexing latency occurs due to the hardware complexity and the baseline cache is enhanced with high associativity or large capacity.

13.
Wireless data broadcasting is a newly developed data dissemination method for spreading public information to a very large number of mobile subscribers. Access latency and tuning time are the two main criteria for evaluating the performance of such a system. With the help of indexing, clients can reduce tuning time significantly by searching the indices first and switching to doze mode during the waiting period. Different indexing schemes are evaluated under different settings, so their efficiency is hard to compare directly. In this paper, we redesigned several of the most popular indexing schemes for data broadcasting systems, i.e., distributed index, exponential index, hash table, and Huffman tree index. We created a unified communication model and constructed a novel evaluation strategy that uses probability theory to formulate the performance of each scheme theoretically, and then conducted simulations to compare their performance by numerical experiments. This is the first work to provide a scalable communication environment and accurate evaluation strategies. Our communication model can easily be modified to meet specific requirements. Our comparison model can be used by service providers to evaluate other indexing schemes and choose the best one for their systems.

14.
Data broadcast is an attractive data dissemination method in mobile environments. To improve energy efficiency, existing air indexing schemes for data broadcast have focused on reducing tuning time only, i.e., the duration that a mobile client stays active in data accesses. On the other hand, existing broadcast scheduling schemes have aimed at reducing access latency through nonflat data broadcast to improve responsiveness only. Not much work has addressed the energy efficiency and responsiveness issues concurrently. This paper proposes an energy-efficient indexing scheme called MHash that optimizes tuning time and access latency in an integrated fashion. MHash reduces tuning time by means of hash-based indexing and enables nonflat data broadcast to reduce access latency. The design of hash function and the optimization of bandwidth allocation are investigated in depth to refine MHash. Experimental results show that, under skewed access distribution, MHash outperforms state-of-the-art air indexing schemes and achieves access latency close to optimal broadcast scheduling.

15.
Broadcasting is an effective means of disseminating information in a wireless environment to a large number of clients with powerful palmtops. However, it requires the clients to be actively listening to the communication channels for the desired information. Because of the high power consumption of the active mode, it is crucial for the battery-operated palmtops to conserve their energy in order to extend their effective battery life. This calls for selective tuning mechanisms that allow the clients to operate in the less energy-consuming doze mode, and to operate only in active mode when the desirable portion of the information is broadcast. Most of the existing work focuses on uniform broadcast. In practice, only a small amount of information is highly in demand by a large number of clients while the remainder is less popular. This nonuniform access pattern poses several new issues. In this paper, we examine these issues and look at how a nonuniform broadcast can be organized for selective tuning by the clients. We describe several indexing schemes to facilitate selective tuning which are variations of existing techniques on uniform broadcast. We analyze the performance of the schemes based on the average tuning time and average access time.

16.
LIGHT: A Query-Efficient Yet Low-Maintenance Indexing Scheme over DHTs   Total citations: 1, self-citations: 0, citations by others: 1
DHT is a widely used building block for scalable P2P systems. However, as uniform hashing employed in DHTs destroys data locality, it is not a trivial task to support complex queries (e.g., range queries and k-nearest-neighbor queries) in DHT-based P2P systems. In order to support efficient processing of such complex queries, a popular solution is to build indexes on top of the DHT. Unfortunately, existing over-DHT indexing schemes suffer from either query inefficiency or high maintenance cost. In this paper, we propose LIGhtweight Hash Tree (LIGHT), a query-efficient yet low-maintenance indexing scheme. LIGHT employs a novel naming mechanism and a tree summarization strategy for graceful distribution of its index structure. We show through analysis that it can support various complex queries with near-optimal performance. Extensive experimental results also demonstrate that, compared with state-of-the-art over-DHT indexing schemes, LIGHT saves 50-75 percent of index maintenance cost and substantially improves query performance in terms of both response time and bandwidth consumption. In addition, LIGHT is designed over generic DHTs and hence can be easily implemented and deployed in any DHT-based P2P system.
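
The remark that uniform hashing destroys data locality can be illustrated with a small sketch. It does not implement LIGHT or any particular DHT: a SHA-1 digest taken modulo the node count stands in for DHT placement, and contiguous range partitioning serves as the locality-preserving contrast. `NUM_NODES`, `hashed_node`, `ranged_node`, and `nodes_for_range` are hypothetical names.

```python
import hashlib
from typing import Callable, Set

NUM_NODES = 8  # hypothetical number of peers


def hashed_node(key: int) -> int:
    """Place a key by uniform hashing (stand-in for DHT placement)."""
    digest = hashlib.sha1(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_NODES


def ranged_node(key: int, key_space: int = 800) -> int:
    """Place a key by contiguous range partitioning over [0, key_space)."""
    return key * NUM_NODES // key_space


def nodes_for_range(lo: int, hi: int, place: Callable[[int], int]) -> Set[int]:
    """Set of nodes that a range query over the keys lo..hi must contact."""
    return {place(k) for k in range(lo, hi + 1)}
```

With range partitioning, the keys 100 to 149 all fall on one node in this setup. With hashed placement, neighboring keys are assigned independently of one another, so the same range typically touches several nodes, which is why over-DHT schemes add an index layer on top.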

17.
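Fast Image Retrieval Based on Vector Quantization   Total citations: 7, self-citations: 0, citations by others: 7
Ye Hangjun, Xu Guangyou. Journal of Software (《软件学报》), 2004, 15(5): 712-719
Traditional indexing methods suffer from the "curse of dimensionality" on high-dimensional data. Describing the data distribution accurately and partitioning the data space effectively are the key problems in high-dimensional indexing. The paper proposes an indexing method based on vector quantization. It uses a Gaussian mixture model to describe the overall data distribution and trains an optimized vector quantizer to partition the data space. The Gaussian mixture model describes the data distribution of real image databases better, and partitioning by vector quantization exploits the statistical correlation between dimensions to build a more accurate approximate representation of the data vectors. This improves the filtering efficiency of the index structure and reduces the number of data vectors that must be accessed. Experiments on a large real image database show that the method significantly reduces the I/O cost that dominates retrieval time and improves indexing performance.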

18.
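The paper studies search strategies for complex multi-dimensional data queries in unstructured P2P systems with super-nodes, and proposes a comprehensive framework for such networks in which sharing, indexing, and querying of multi-dimensional data can be handled. Building on the R*-tree, it proposes the EIR-tree, an extended R*-tree index suited to P2P, and studies methods for collecting and maintaining cluster information in the system and for building and maintaining the EIR-tree.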

19.
Scientific datasets are often stored on distributed archival storage systems, because geographically distributed sensor devices store the datasets on their local machines and also because the size of scientific datasets demands a large amount of disk space. Multidimensional indexing techniques have been shown to greatly improve range query performance on large scientific datasets. In this paper, we discuss several ways of distributing a multidimensional index in order to speed up access to large distributed scientific datasets. This paper compares the designs, challenges, and problems of distributed multidimensional indexing schemes, and provides a comprehensive performance study of distributed indexing to provide guidelines for choosing a distributed multidimensional index for a specific data analysis application.

20.
Dominant features for content-based image retrieval usually have high dimensionality. So far, much research has been done on indexing such values to support fast retrieval. Still, many existing indexing schemes suffer from performance degradation due to the curse of dimensionality. As an alternative, heuristic algorithms have been proposed to calculate the answer with "high probability" at the cost of accuracy. In this paper, we propose a new hash tree-based indexing structure called tertiary hash tree for indexing high-dimensional feature data. Tertiary hash tree provides several advantages compared to the traditional extendible hash structure in terms of resource usage and search performance. Through extensive experiments, we show that our proposed index structure achieves outstanding performance.
