
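Query optimization based on the Greenplum database
Cite as:ZOU Chengming, XIE Yi, WU Pei. Query optimization based on the Greenplum database[J]. Journal of Computer Applications (计算机应用), 2018, 38(2): 478-482.
Authors:ZOU Chengming (邹承明)  XIE Yi (谢义)  WU Pei (吴佩)
Affiliation:1. Hubei Key Laboratory of Transportation Internet of Things (Wuhan University of Technology), Wuhan 430070; 2. School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070
Funding:National Natural Science Foundation of China (61503289); Hubei Province Science and Technology Support Program (2015BAA120, 2015BCE068).
Abstract:To address the drop in query efficiency of distributed databases as data volume grows, this work takes the Greenplum distributed database as its research object and proposes a cost-based method for generating an optimal query plan from the perspective of query path optimization. First, the method designs a cost model to estimate query cost. Then, a parallel max-min ant colony algorithm searches for the join order with the minimum query cost, that is, the optimal join order. Finally, the optimal query plan is obtained from Greenplum's default best choice for each operation in the plan. Multiple groups of experiments were carried out with this method on a self-generated data set and on the Transaction Processing Performance Council Benchmark H (TPC-H) standard data set. The results show that the proposed method effectively finds the optimal solution and yields the optimal query plan, thereby improving the query efficiency of the Greenplum database.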

Keywords:distributed database  Greenplum database  optimal query plan  cost model  optimal join order
Received:2017-07-31
Revised:2017-09-08

Query optimization based on the Greenplum database
ZOU Chengming, XIE Yi, WU Pei. Query optimization based on Greenplum database[J]. Journal of Computer Applications, 2018, 38(2): 478-482.
Authors:ZOU Chengming  XIE Yi  WU Pei
Affiliation:1. Hubei Key Laboratory of Transportation Internet of Things (Wuhan University of Technology), Wuhan, Hubei 430070, China; 2. School of Computer Science and Technology, Wuhan University of Technology, Wuhan, Hubei 430070, China
Abstract:Query efficiency in distributed databases degrades as data volume grows. To address this problem, the Greenplum distributed database is taken as the research object, and a cost-based method for generating an optimal query plan is proposed from the perspective of query path optimization. First, a cost model is designed to estimate query cost. Next, a parallel max-min ant colony algorithm searches for the join order with the minimum query cost, i.e. the optimal join order. Finally, the optimal query plan is obtained by applying Greenplum's default best choice for each operation in the plan. Multiple groups of experiments were carried out with the proposed method on a self-generated data set and on the Transaction Processing Performance Council Benchmark H (TPC-H) standard data set. The experimental results show that the proposed method effectively finds the optimal solution and yields the optimal query plan, thereby improving the query efficiency of the Greenplum database.
Keywords:distributed database  Greenplum database  optimal query plan  cost model  optimal join order
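
The abstract does not give the cost model or the algorithm's parameters, so the following is only a minimal illustrative sketch of the general idea: a max-min ant colony search over join orders guided by a cost estimate. Everything in it is an assumption made for illustration and not the authors' method: the toy cost model (sum of estimated intermediate result sizes of a left-deep plan), the function names `plan_cost` and `mmas_join_order`, the table names, cardinalities and selectivities, the fixed pheromone deposit, and all parameter values. It also runs sequentially, whereas the paper describes a parallel variant.

```python
import itertools
import random

def plan_cost(order, card, sel):
    """Toy cost (an assumption): sum of estimated intermediate result
    sizes when the tables are joined left to right in the given order."""
    joined = [order[0]]
    rows = float(card[order[0]])
    total = 0.0
    for t in order[1:]:
        rows *= card[t]
        # Apply the selectivity of every join predicate linking t
        # to a table that is already part of the intermediate result.
        for u in joined:
            rows *= sel.get(frozenset((u, t)), 1.0)
        joined.append(t)
        total += rows
    return total

def mmas_join_order(card, sel, n_ants=10, n_iters=50, rho=0.1,
                    tau_min=0.01, tau_max=1.0, seed=0):
    """Simplified max-min ant search over join orders (illustrative only)."""
    rng = random.Random(seed)
    tables = sorted(card)
    # Pheromone on "table b is joined right after table a",
    # initialised to the upper bound as in max-min ant systems.
    tau = {(a, b): tau_max for a in tables for b in tables if a != b}
    best_order, best_cost = None, float("inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant builds a full join order, picking the next table
            # with probability proportional to the pheromone level.
            order = [rng.choice(tables)]
            remaining = [t for t in tables if t != order[0]]
            while remaining:
                weights = [tau[(order[-1], t)] for t in remaining]
                nxt = rng.choices(remaining, weights=weights, k=1)[0]
                order.append(nxt)
                remaining.remove(nxt)
            cost = plan_cost(order, card, sel)
            if cost < best_cost:
                best_order, best_cost = order, cost
        # Evaporate everywhere, reinforce only the best-so-far order
        # (fixed deposit, a simplification), then clamp to the bounds.
        for key in tau:
            tau[key] *= 1.0 - rho
        for a, b in zip(best_order, best_order[1:]):
            tau[(a, b)] += rho * tau_max
        for key in tau:
            tau[key] = min(tau_max, max(tau_min, tau[key]))
    return best_order, best_cost

# Hypothetical statistics for four tables.
card = {"customer": 1500, "lineitem": 60000, "nation": 25, "orders": 15000}
sel = {
    frozenset(("customer", "orders")): 1 / 1500,
    frozenset(("orders", "lineitem")): 1 / 15000,
    frozenset(("customer", "nation")): 1 / 25,
}

order, cost = mmas_join_order(card, sel)
# With only four tables an exhaustive search is cheap, which gives a
# reference value to compare the heuristic result against.
exhaustive = min(plan_cost(list(p), card, sel)
                 for p in itertools.permutations(card))
print(order, cost, exhaustive)
```

The pheromone bounds `tau_min` and `tau_max` are what distinguish the max-min variant from a basic ant colony algorithm: clamping keeps every transition selectable, so the search does not collapse onto the first good order it finds. A fuller version would typically also weight each choice by a heuristic term derived from the cost model.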