首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到19条相似文献,搜索用时 187 毫秒
1.
针对Yannis Theodoridis等人提出的空间连接代价模型存在比较理想化的限制条件——假设数据均匀分布,缓冲策略使用简单的缺点,利用划分子空间并抽样获取非均匀数据实际密度的策略,提出了优先保存查询集合树的最新访问路径的有效中间节点的缓冲区算法,给出了改进后的评估公式。实验结果表明,改进后的模型比原模型提高了评估的精确度。  相似文献   

2.
针对Yannis Theodoridis等人提出的空间连接代价模型存在比较理想化的限制条件--假设数据均匀分布,缓冲策略使用简单的缺点,利用划分子空间并抽样获取非均匀数据实际密度的策略,提出了优先保存查询集合树的最新访问路径的有效中问节点的缓冲区算法,给出了改进后的评估公式.实验结果表明,改进后的模型比原模型提高了评估的精确度.  相似文献   

3.
宋杰  李甜甜  朱志良  鲍玉斌  于戈 《软件学报》2015,26(6):1438-1456
数据的指数级增长给数据管理和分析带来了严峻的挑战.连接查询是数据分析中一种常用运算,而MapReduce是一种用于大规模数据集并行处理的编程模型,研究基于MapReduce的连接查询代价评估和查询优化,有着学术意义和应用价值.MapReduce连接查询算法的性能主要取决于I/O代价(包括本地和网络I/O),而I/O代价与数据集以及连接运算的特征参数相关,通过对二元连接的I/O代价评估可以优化多元连接执行计划.基于此,首先提出了二元连接查询的I/O代价模型;随后,对现有二元连接算法进行形式化定义和简单扩展,归纳出6种基于MapReduce连接查询算法,并通过算法白盒分析定义它们的I/O代价函数;最后,提出一种多元连接最优执行计划的选择算法.通过实验表明I/O代价模型的正确性且能够准确地反映算法的性能优劣.  相似文献   

4.
空间信息处理和地理信息系统等领域的数据管理涉及到海量、高维空间数据对象的处理。本文针对传统数据索引结构在处理这类空间数据时所存在的内存使用过大、I/O消耗过多等问题,通过改进选择查询的代价模型,给出了基于PQR-tree的查询和代价模型,以提高空间数据查询的性能。提出了基于PQR-tree的三阶段并行查询的方法,分别在任务创建、分配、执行阶段进行优化。提出在任务创建和任务分配阶段应用于空间查询中过滤和精炼阶段的有效算法。测试表明,本文算法在处理各种不同分布类型数据集过程中有效降低了空间数据处理对时间和空间的代价和需求,并且并行机制下的代价模型在预测和评估方面也具有较好的精确度。  相似文献   

5.
分布式数据库中多元连接查询优化的研究   总被引:1,自引:0,他引:1  
论文对分布式数据库中多元连接查询操作次序的确定问题提出了优化,通过引入收益代价比的概念,提出了一基于贪心算法的选择模型。通过该模型,可以得到理想的连接次序的选取方案。  相似文献   

6.
空间数据库中连接运算的处理与优化   总被引:7,自引:0,他引:7       下载免费PDF全文
空间数据库的性能问题严重制约了它的应用与发展 .由于空间连接运算是空间数据库中最复杂、最耗时的基本操作 ,因此其处理效率在很大程度上决定了空间数据库的整体性能 .尽管目前已经有许多空间连接算法 ,但空间连接运算的代价估计和查询优化仍然有待进一步研究 .众所周知 ,大部分空间连接算法都是基于 R树索引实现的 ,如果参与空间连接运算的关系上没有索引或只有部分索引 ,那么就需要使用特殊的算法来处理 .另外 ,各种算法的代价评估模型需要一个相对统一的计算方法 ,实践证明 ,根据空间数据库的实际情况 ,使用 I/ O代价来估计算法的复杂性较为合理 .在此基础上 ,针对复杂的空间查询中可能出现多个关系参与空间连接运算的情况 ,故还需要合理地应用动态编程算法来找出代价最优的连接顺序 ,以便最终形成一个通用的算法框架 .通过对该算法框架的复杂性分析可以看出 ,在此基础上实现的空间数据库查询优化系统将具有较高的时空效率 ,并且能够处理非常复杂的空间查询  相似文献   

7.
王立  王跃清  王翰虎  陈梅 《计算机应用》2011,31(5):1400-1403
使用闪存作为存储介质成为提高数据库系统性能的一条新途径,为了解决闪存数据库系统存储管理技术中基于日志的更新策略存在查询效率低、日志区空间分配不合理、索引更新代价高等问题,提出了基于Bloom Filter的最新版本预测算法,引入记录定位器结构,提出日志概要结构和基于闪存更新查询代价评估模型的自适应机制。实验证明,该方法能够自适应地划分合理的日志区空间,有效提高查询性能,减少各种非聚集索引的更新代价。  相似文献   

8.
郭敬宇 《计算机工程》2004,30(14):103-104,145
介绍并改进了一种基于动态模型的,用于入侵检测系统的机器学习算法。该算法抽取出IP分组和TCP连接的23个属性域,经过训练得到规则集。结合改进后的模型,使用该规则集对1999 DARPA入侵检测离线评估数据集进行了测试,取得了较好的结果。  相似文献   

9.
基于R-Tree的空间查询代价模型研究   总被引:5,自引:0,他引:5  
本文对基于R-Tree的空间查询代价模型进行了探讨,分析了Y.Theodoridis等提出的矩形密度模型^[2,3],利用其结果提出了代价估计的概率模型,并通过实验验证了概率模型的估计精确度较矩形密度模型有了显著的提高.  相似文献   

10.
多表连接查询是并行数据库中的一种常用且重要的操作,然而基于传统遗传算法所制定的多表连接查询计划,往往存在查询响应时间长的缺陷。根据无共享并行数据库的特点,将一种新的代价估计模型引入到传统遗传算法中,并对传统遗传算法进行了改进。实验证明改进后的遗传算法能制定出更优的查询计划,从而减少多表连接时的查询响应时间。  相似文献   

11.
In a distributed spatial database system, a user may issue a query that relates two spatial relations not stored at the same site. Because of the sheer volume and complexity of spatial data, spatial joins between two spatial relations at different sites are expensive in terms of computational and transmission costs. In this paper, we address the problems of processing spatial joins in a distributed environment. We propose a semijoin-like operator, called the spatial semijoin, to prune away objects that do not contribute to the join result. This operator also reduces both the transmission and local processing costs for a later join operation. However, the cost of the elimination process must be taken into account, and we consider approaches to minimize these overheads. We also study and compare two families of distributed join algorithms that are based on the spatial semijoin operator. The first is based on multi-dimensional approximations obtained from an index such as the R-tree, and the second is based on single-dimensional approximations obtained from object mapping. We have conducted experiments on real data sets and report the results in this paper  相似文献   

12.
13.
PMR四分树空间索引结构在包含空间连接的空间数据库的查询中是很有效的,本文对桶载入PMR四分树的算法做了一些改进,即两种互补的技术:一种改进的插入算法和一种桶载入方法。该技术使得四分树的构造速度相对于传统的四分树构造方法大大提高。该方法可运用到许多基于规则划分的空间数据结构上,来加快它们的构造。  相似文献   

14.
Slot index spatial join   总被引:3,自引:0,他引:3  
Efficient processing of spatial joins is very important due to their high cost and frequent application in spatial databases and other areas involving multidimensional data. This paper proposes slot index spatial join (SISJ), an algorithm that joins a nonindexed data set with one indexed by an R-tree. We explore two optimization techniques that reduce the space requirements and the computational cost of SISJ and we compare it, analytically and experimentally, with other spatial join methods for two cases: 1) when the nonindexed input is read from disk and 2) when it is an intermediate result of a preceding database operator in a complex query plan. The importance of buffer splitting between consecutive join operators is also demonstrated through a two-join case study and a method that estimates the optimal splitting is proposed. Our evaluation shows that SISJ outperforms alternative methods in most cases and is suitable for limited memory conditions.  相似文献   

15.
This paper proposes a semi-greedy framework for optimizing multi-join queries in shared-nothing systems.The plan generated by the framework comprises several pipelines,each performing several joins.The framework determines the “optimal” number of joins to be performed in each pipeline.The decisions are made based on the cost estimation of the entire processing plan.Two existing optimization algorithms are extended under the framework.An analytical model is presented and used to compare the quality of plans produced by each optimization algorithm.Our study shows that the new algorithms outperform their counterparts that are not extended.  相似文献   

16.
Speeding up construction of PMR quadtree-based spatial indexes   总被引:5,自引:0,他引:5  
Spatial indexes, such as those based on the quadtree, are important in spatial databases for efficient execution of queries involving spatial constraints, especially when the queries involve spatial joins. In this paper we present a number of techniques for speeding up the construction of quadtree-based spatial indexes, specifically the PMR quadtree, which can index arbitrary spatial data. We assume a quadtree implementation using the “linear quadtree”, a disk-resident representation that stores objects contained in the leaf nodes of the quadtree in a linear index (e.g., a B-tree) ordered based on a space-filling curve. We present two complementary techniques: an improved insertion algorithm and a bulk-loading method. The bulk-loading method can be extended to handle bulk-insertions into an existing PMR quadtree. We make some analytical observations about the I/O cost and CPU cost of our PMR quadtree bulk-loading algorithm, and conduct an extensive empirical study of the techniques presented in the paper. Our techniques are found to yield significant speedup compared to traditional quadtree building methods, even when the size of a main memory buffer is very small compared to the size of the resulting quadtrees. Edited by R. Sacks-Davis. Received: July 10, 2001 / Accepted: March 25, 2002 Published online: September 25, 2002  相似文献   

17.
软件测试是软件工程的一个重要阶段。在软件测试工作开展以前,恰当的估算软件测试的规模及成本,将使软件产品的质量得到大幅提高。提出一种基于算法模型的软件功能测试成本估算模型,给出了模型中参数的初步校准值,并在实践项目中进行了应用。实践表明,该模型在实践中可以较好地估算软件功能测试规模及成本,为测试计划的制定及测试工作的开展,起到积极作用。  相似文献   

18.
Caching Strategies for Spatial Joins   总被引:1,自引:0,他引:1  
  相似文献   

19.
A multiway spatial join combines information found in three or more spatial relations with respect to some spatial predicates. Motivated by their close correspondence with constraint satisfaction problems (CSPs), we show how multiway spatial joins can be processed by systematic search algorithms traditionally used for CSPs. This paper describes two different strategies, window reduction and synchronous traversal, that take advantage of underlying spatial indexes to prune the search space effectively. In addition, we provide cost models and optimization methods that combine the two strategies to compute more efficient execution plans. Finally, we evaluate the efficiency of the proposed techniques and the accuracy of the cost models through extensive experimentation with several query and data combinations. Received December 21, 1998; revised September 29, 1998.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号