
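Research on Intelligent Data Partitioning and Layout (智能数据分区与布局研究)
Cite this article: LIU Huan, LIU Peng-Ju, WANG Tian-Yi, HE Yu-Qi, SUN Lu-Ming, LI Cui-Ping, CHEN Hong. 智能数据分区与布局研究 [J]. 软件学报 (Journal of Software), 2022, 33(10): 3819-3843.
Authors: LIU Huan (刘欢), LIU Peng-Ju (刘鹏举), WANG Tian-Yi (王天一), HE Yu-Qi (何雨琪), SUN Lu-Ming (孙路明), LI Cui-Ping (李翠平), CHEN Hong (陈红)
Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering of the Ministry of Education (Renmin University of China), Beijing 100872, China; School of Information, Renmin University of China, Beijing 100872, China
Funding: Beijing Natural Science Foundation (4212022); National Key Research and Development Program of China (2018YFB1004401); National Natural Science Foundation of China (61772537, 61772536, 62072460, 62076245)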
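Abstract: In the era of big data, data volumes are enormous and data-driven analytical applications keep multiplying. Extracting the information needed for analysis and decision-making from such massive data quickly and efficiently poses a major challenge for database systems. At the same time, the demand of modern business analytics for up-to-date data requires a database system to process both ACID transactions and complex analytical queries quickly. However, traditional data partitioning is too coarse-grained and cannot adapt to dynamically changing, complex analytical workloads; traditional data layouts are limited to a single format and cannot cope with the rapidly growing number of hybrid transactional/analytical application scenarios. To address these problems, "intelligent data partitioning and layout" has become an active research topic. It uses techniques such as data mining and machine learning to extract effective features of the workload, designs the best partitioning strategy to avoid scanning large amounts of irrelevant data, and guides layout design to suit different types of workloads. This paper first introduces the background of intelligent data partitioning and layout, then describes in detail its research motivation, development trends, and key technologies. Finally, it summarizes the field and discusses its research prospects.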

Keywords: database system; partitioning strategy; layout strategy; machine learning
Received: 2021/1/19
Revised: 2021/4/15

Survey of Intelligent Partition and Layout Technology in Database System
LIU Huan, LIU Peng-Ju, WANG Tian-Yi, HE Yu-Qi, SUN Lu-Ming, LI Cui-Ping, CHEN Hong. Survey of Intelligent Partition and Layout Technology in Database System [J]. Journal of Software, 2022, 33(10): 3819-3843.
Authors: LIU Huan, LIU Peng-Ju, WANG Tian-Yi, HE Yu-Qi, SUN Lu-Ming, LI Cui-Ping, CHEN Hong
Affiliation: Key Laboratory of Data Engineering and Knowledge Engineering of the Ministry of Education (Renmin University of China), Beijing 100872, China; School of Information, Renmin University of China, Beijing 100872, China
Abstract: In the era of big data, a growing number of analytical applications are driven by large-scale data. Extracting the information needed for analysis and decision-making from such massive data quickly and efficiently poses a great challenge to database systems. At the same time, the need for real-time data in modern business analysis and decision-making requires a database system to process both ACID transactions and complex analytical queries quickly. However, traditional data partitioning is too coarse-grained and cannot adapt to dynamically changing, complex analytical workloads; traditional data layouts support only a single format and cannot cope with the increasing number of hybrid transactional and analytical application scenarios. To solve these problems, "intelligent data partition and layout" has become one of the current research hotspots. It extracts effective workload features through data mining, machine learning, and other techniques, designs an appropriate partitioning strategy to avoid scanning large amounts of irrelevant data, and guides layout design to suit different types of workloads. This paper first introduces the background of data partition and layout techniques, and then elaborates on the research motivation, development trends, and key technologies of intelligent data partition and layout. Finally, it summarizes the field and discusses the research prospects of intelligent data partition and layout.
Keywords: database system; partitioning strategy; layout strategy; machine learning