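Water Census Data Warehouse Based on Hive
Cite this article:CHEN Long, WAN Ding-sheng, GU Xin-chen. Water Census Data Warehouse Based on Hive[J]. Computer and Modernization (计算机与现代化), 2014, 0(5): 127-130.
Authors:CHEN Long  WAN Ding-sheng  GU Xin-chen
Affiliation:College of Computer and Information, Hohai University, Nanjing 211100, Jiangsu, China
Funding:National Natural Science Foundation of China (51079040); Ministry of Water Resources 948 Program (201016)
Abstract:Water census data is massive and multidimensional. To address this, the paper studies Hadoop and Hive, which have developed rapidly in recent years under the "big data" concept, and combines them with the mature multidimensional analysis techniques of traditional data warehouses to propose a method for building a water census data warehouse on Hive. It describes the architecture of the data warehouse system and, following Hive's design characteristics, improves the traditional multidimensional analysis model through bucketing, dimension table reduction, and redundant fact tables. Finally, a cluster is set up to run query and analysis tests on a water census data set. The test results show that the data warehouse can meet the storage and query requirements of massive multidimensional water census data.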

Keywords:Hive  data warehouse  water census  model optimization  large-scale data processing

Water Census Data Warehouse Based on Hive
CHEN Long, WAN Ding-sheng, GU Xin-chen. Water Census Data Warehouse Based on Hive[J]. Computer and Modernization, 2014, 0(5): 127-130.
Authors:CHEN Long  WAN Ding-sheng  GU Xin-chen
Affiliation:(College of Computer and Information, Hohai University, Nanjing 211100, China)
Abstract:Water census data is large in volume and high in dimensionality. This paper examines Hadoop and Hive, which have developed quickly in recent years under the "big data" concept, and combines them with mature multidimensional analysis techniques from traditional data warehouses to propose a method for constructing a water census data warehouse based on Hive. The paper describes the architecture of the data warehouse system and improves the multidimensional model through dimension table reduction, fact table redundancy, and Hive's bucketing. It then runs queries and analyses against a water census data set on a Hadoop cluster. Experimental results show that the data warehouse meets the storage and query requirements of massive multidimensional water census data.
Keywords:Hive  data warehouse  water census  model optimization  large-scale data processing
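
The abstract names three model changes: bucketing, dimension table reduction, and redundant fact tables. The Python sketch below is only an illustration of those ideas, not the authors' implementation and not Hive code. All table names, fields, values, and the bucket count are hypothetical. Buckets are assigned here by taking an integer key modulo the bucket count, and a small dimension table is folded into the fact rows so that an aggregate query needs no join.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count, chosen only for illustration

# Hypothetical dimension table: region_id -> region name
region_dim = {1: "Jiangsu", 2: "Zhejiang", 3: "Anhui"}

# Hypothetical fact rows: (reservoir_id, region_id, capacity)
facts = [(101, 1, 50), (102, 2, 35), (103, 1, 12), (104, 3, 78), (105, 2, 9)]


def bucket_of(key, num_buckets=NUM_BUCKETS):
    """Assign an integer key to a bucket by taking it modulo the bucket count."""
    return key % num_buckets


def denormalize(fact_rows, dim):
    """Fold the dimension attribute into each fact row (a redundant, wide
    fact table), so later queries do not need to join against the dimension."""
    return [(rid, dim[region_id], cap) for rid, region_id, cap in fact_rows]


def bucketize(rows, key_index=0):
    """Group rows into buckets keyed on the column at key_index."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[bucket_of(row[key_index])].append(row)
    return dict(buckets)


def total_capacity_by_region(wide_rows):
    """Aggregate directly on the wide rows; no dimension lookup is required."""
    totals = defaultdict(int)
    for _, region, cap in wide_rows:
        totals[region] += cap
    return dict(totals)


wide_facts = denormalize(facts, region_dim)
buckets = bucketize(wide_facts)
totals = total_capacity_by_region(wide_facts)
```

The trade-off in this sketch is the usual one for denormalization: the wide rows repeat the region name in every record, spending storage to avoid a join at query time, while bucketing on the key lets a lookup touch only one group of rows.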
This article is indexed in CNKI, VIP (维普), and other databases.