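HBase-based distributed storage system for meteorological ground minute data
Cite as:CHEN Donghui,ZENG Le,LIANG Zhongjun,XIAO Weiqing.HBase-based distributed storage system for meteorological ground minute data[J].Journal of Computer Applications,2014,34(9):2617-2621.
Authors:CHEN Donghui  ZENG Le  LIANG Zhongjun  XIAO Weiqing
Affiliation:Engineering System Division, National Meteorological Information Center, Beijing 100081, China
Funding:Youth Science and Technology Fund of the National Meteorological Information Center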
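Abstract:Meteorological ground minute data comprise many elements, carry a large volume of information and are generated at high frequency, so traditional relational database systems become overloaded and deliver poor read and write performance when storing and managing them. Based on a study of the storage model of the distributed database HBase, a storage structure model for meteorological minute data was designed in which the row key is composed of the time plus the station number, enabling distributed storage of massive meteorological data and management of meta-information. Because HBase's only index responds too slowly to the complex query cases of meteorological operations, secondary indexes were built on the relevant fields through the API of the search engine Solr, with reference to the operational query cases, to meet retrieval-time requirements. Experimental results show that the system has good storage capacity and retrieval efficiency: the ingest rate reaches up to 34000 records per second, and results for routine query cases are returned within milliseconds, which meets the storage and query-time performance requirements of large-scale meteorological data in operational applications.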

Keywords:minute data  distributed storage  Hadoop  Solr  HBase  secondary index
Received:2014-03-12
Revised:2014-04-19

HBase-based distributed storage system for meteorological ground minute data
CHEN Donghui,ZENG Le,LIANG Zhongjun,XIAO Weiqing.HBase-based distributed storage system for meteorological ground minute data[J].Journal of Computer Applications,2014,34(9):2617-2621.
Authors:CHEN Donghui  ZENG Le  LIANG Zhongjun  XIAO Weiqing
Affiliation:Engineering System Division, National Meteorological Information Center, Beijing 100081, China
Abstract:Meteorological ground minute data comprise many elements, carry a large amount of information and are generated at high frequency, so traditional relational database systems suffer from server overload and poor read and write performance when storing and managing them. Based on a study of the storage model of the distributed database HBase, a data model for meteorological ground minute data was proposed in which the row key is formed from the time plus the station number, achieving distributed storage of massive meteorological data and meta-information management. For complex meteorological query cases, the response time of HBase's only index is too long. To address this and meet retrieval-time requirements, the API offered by the search engine Solr was used to build secondary indexes on the relevant fields, chosen with reference to those query cases. The experimental results show that the system stores and retrieves data efficiently: the maximum storage rate reaches 34000 records/s, and results for generic query cases are returned within milliseconds. The method can satisfy the performance requirements of large-scale meteorological data in operational applications.
Keywords:minute data  distributed storage  Hadoop  Solr  HBase  secondary index
This article is indexed in Wanfang Data and other databases.
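
The row-key and secondary-index design summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: plain dictionaries stand in for the HBase table and the Solr index, and the minute-resolution timestamp format, the 5-digit station IDs, the field names (`province`, `temp_c`), and the names `make_row_key` and `MinuteStore` are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def make_row_key(obs_time: datetime, station_id: str) -> str:
    """Build a row key as observation time (to the minute) plus station number.

    The YYYYMMDDHHMM layout is an assumed format, not taken from the paper.
    """
    return obs_time.strftime("%Y%m%d%H%M") + station_id


class MinuteStore:
    """In-memory stand-in for a row-key table with a secondary index.

    `rows` plays the role of the HBase table; `index` plays the role of
    the external search index that maps field values back to row keys.
    """

    def __init__(self, indexed_fields):
        self.rows = {}                            # row key -> record
        self.indexed_fields = tuple(indexed_fields)
        self.index = defaultdict(set)             # (field, value) -> row keys

    def put(self, obs_time, station_id, record):
        """Store one record and update the secondary index."""
        key = make_row_key(obs_time, station_id)
        stored = dict(record, station=station_id)
        self.rows[key] = stored
        for field in self.indexed_fields:
            if field in stored:
                self.index[(field, stored[field])].add(key)
        return key

    def scan_minute(self, obs_time):
        """Prefix scan: every station's record for one minute."""
        prefix = obs_time.strftime("%Y%m%d%H%M")
        return [self.rows[k] for k in sorted(self.rows) if k.startswith(prefix)]

    def query(self, field, value):
        """Two-step lookup: index gives row keys, then rows are fetched by key."""
        keys = sorted(self.index.get((field, value), ()))
        return [self.rows[k] for k in keys]


if __name__ == "__main__":
    store = MinuteStore(indexed_fields=["province"])
    t = datetime(2014, 3, 12, 8, 5)
    store.put(t, "54511", {"province": "Beijing", "temp_c": 3.2})
    store.put(t, "58362", {"province": "Shanghai", "temp_c": 9.1})
    print(store.scan_minute(t))
    print(store.query("province", "Beijing"))
```

Putting the time first keeps all records for one minute adjacent in key order, so a prefix scan retrieves them together. Queries on any other field first consult the index for matching row keys and then fetch those rows directly, which is the general idea behind pairing a key-ordered store with an external search index.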