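Design and Implementation of a Numerical Weather Prediction Product Service Platform Based on Hadoop
Cite this article: Li Yongsheng, Zeng Qin, Xu Meihong, Shi Xiaoying. Design and Implementation of a Numerical Weather Prediction Product Service Platform Based on Hadoop[J]. 应用气象学报 (Quarterly Journal of Applied Meteorology), 2015, 26(1): 122-128.
Authors: Li Yongsheng, Zeng Qin, Xu Meihong, Shi Xiaoying
Affiliation: Guangdong Provincial Meteorological Information Center, Guangzhou 510080
Funding: Guangzhou Science and Technology Program (2012Y2 00031, 2013Y2 00053, 2013Y2 00074); Special Fund for Public Welfare Industry (Meteorology) Research (GYHY201106009); Key Project of the Guangdong Meteorological Bureau (2012A01)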
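Abstract: The volume of numerical weather prediction (NWP) product data grows daily, and storing and managing it in a traditional relational database suffers from low efficiency and insufficient storage capacity. File-based storage, in turn, runs into performance bottlenecks in data storage processing, data reading, and algorithm computation. To address these problems, a distributed data storage model was designed on the Hadoop technology stack, providing distributed storage and processing of NWP product data, and an ingest-and-processing module for NWP product data was developed. Operational application interfaces were implemented as REST Web Services, including a field-data access interface for NWP product elements, a time-series data access interface, and a data download interface. Operational tests with multiple business users show that the platform has certain advantages over traditional technical architectures in processing meteorological data such as NWP products and in operational applications.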

Keywords: Hadoop technology stack; meteorological data; Web Service interface
Received: 2014-05-19
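
The field-data and time-series access interfaces named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the regular latitude/longitude grid, the nearest-grid-point lookup, and the function names `axis_index`, `clip_field`, and `time_series` are hypothetical and are not taken from the platform itself, whose actual request parameters are not given in the abstract.

```python
from typing import Dict, List, Tuple

# Assumed layout: rows run south to north, columns run west to east.
Grid = List[List[float]]


def axis_index(value: float, start: float, step: float, count: int) -> int:
    """Index of the grid line nearest to `value`, clamped to the axis range."""
    idx = round((value - start) / step)
    return max(0, min(count - 1, idx))


def clip_field(field: Grid, lat0: float, lon0: float, step: float,
               south: float, north: float, west: float, east: float) -> Grid:
    """Return the sub-grid covering the requested bounding box (inclusive)."""
    nrows, ncols = len(field), len(field[0])
    r0 = axis_index(south, lat0, step, nrows)
    r1 = axis_index(north, lat0, step, nrows)
    c0 = axis_index(west, lon0, step, ncols)
    c1 = axis_index(east, lon0, step, ncols)
    return [row[c0:c1 + 1] for row in field[r0:r1 + 1]]


def time_series(fields: Dict[int, Grid], lat0: float, lon0: float, step: float,
                lat: float, lon: float) -> List[Tuple[int, float]]:
    """Value at the grid point nearest (lat, lon) for every lead time.

    `fields` maps a lead time in hours to the field valid at that lead time.
    The result is sorted by lead time.
    """
    sample = next(iter(fields.values()))
    r = axis_index(lat, lat0, step, len(sample))
    c = axis_index(lon, lon0, step, len(sample[0]))
    return [(lead, fields[lead][r][c]) for lead in sorted(fields)]


def make_field(offset: float) -> Grid:
    """Synthetic 3 x 4 field used only for the demonstration below."""
    return [[offset + 10 * r + c for c in range(4)] for r in range(3)]


# Demonstration on synthetic data: origin at 20N, 110E with 1-degree spacing.
box = clip_field(make_field(0.0), 20.0, 110.0, 1.0,
                 south=21.0, north=22.0, west=111.0, east=112.0)
series = time_series({6: make_field(100.0), 0: make_field(0.0)},
                     20.0, 110.0, 1.0, lat=21.2, lon=112.6)
print(box)
print(series)
```

In a service of this kind, a request handler would presumably read the stored fields from the distributed store and then apply logic similar to `clip_field` or `time_series` before serializing the result. Nearest-point lookup is the simplest choice here; interpolation would be an equally plausible design.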

Design and Implementation of NWP Data Service Platform Based on Hadoop Framework
Li Yongsheng, Zeng Qin, Xu Meihong and Shi Xiaoying. Design and Implementation of NWP Data Service Platform Based on Hadoop Framework[J]. Quarterly Journal of Applied Meteorology, 2015, 26(1): 122-128.
Authors: Li Yongsheng, Zeng Qin, Xu Meihong and Shi Xiaoying
Affiliation: Guangdong Provincial Meteorological Information Center, Guangzhou 510080
Abstract: As the volume of numerical weather prediction (NWP) products grows rapidly every day, traditional relational databases suffer from low efficiency and limited capacity for archiving and management, while file-based storage faces performance challenges in accessing long time series and in massive computation on spatio-temporal data. A three-tier software framework is therefore designed on the Hadoop framework, implementing a distributed data storage model, a parallel data access service, and distributed computation for frequently used statistical algorithms. Meteorological big data such as NWP products, radar 3D mosaics, and satellite remote sensing data are modeled as metadata plus a data entity, both of which are stored in HBase tables and managed on the HDFS file system. Metadata are defined by variable name, dimension, latitude, longitude, altitude, lead time, etc., and the data entity consists of a row key, a time stamp, and a column family that stores the value at each grid point. A REST (representational state transfer) Web Service is set up for direct NWP data acquisition, field data clipping, and location-based time-series access. File download services in "MICAPS", "surfer", and "json" formats are also provided for third-party meteorological software. System testing of data access for the CHAF model shows that writing 1000 NWP data fields, each with 82503 grid points, takes only 12 seconds, and that reading the same amount of data back from the distributed database takes less than 4 seconds. A MapReduce scheme is implemented for computing meteorological algorithms, e.g., Kalman filter and successive regression. Most meteorological statistical algorithms are time independent, which makes it possible to divide a task into small sub-tasks by slicing the data along the time series and to assign these sub-tasks to different computational nodes in map programs. Reduce programs gather and summarize the results of the sub-task computations.
As data volume and the number of users increase, the Hadoop framework deployed on several x86 PC servers demonstrates a performance advantage over a single IBM Power system. Scaling the hardware from 3 to 9 computational nodes yields steady and improved data access efficiency with a good speed-up ratio, which gives more confidence for practical use in weather forecasting. An operational trial in a multi-user environment further shows the advantages of this cloud-like computing service over the traditional client-server model in meteorological data mining, such as NWP interpretation and model evaluation.
Keywords: Hadoop framework; meteorological data; Web Service interface
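
The map/reduce division by time slices described in the abstract can be sketched in plain Python. This is only an illustration under stated assumptions: it runs in a single process rather than on Hadoop, it substitutes a simple mean and variance of forecast errors for the Kalman filter and regression algorithms named in the abstract, and the function names `slice_by_time`, `map_slice`, `merge`, and `mean_and_variance` are hypothetical.

```python
from functools import reduce
from typing import List, Sequence, Tuple

# Partial summary of one time slice: (count, sum, sum of squares).
Partial = Tuple[int, float, float]


def slice_by_time(series: Sequence[float], size: int) -> List[Sequence[float]]:
    """Split a time series into consecutive slices of at most `size` steps."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [series[i:i + size] for i in range(0, len(series), size)]


def map_slice(chunk: Sequence[float]) -> Partial:
    """Map step: summarize one slice without looking at any other slice."""
    return (len(chunk), sum(chunk), sum(x * x for x in chunk))


def merge(a: Partial, b: Partial) -> Partial:
    """Reduce step: combine two partial summaries into one."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def mean_and_variance(series: Sequence[float], size: int) -> Tuple[float, float]:
    """Mean and population variance computed from per-slice summaries."""
    partials = map(map_slice, slice_by_time(series, size))
    n, total, total_sq = reduce(merge, partials, (0, 0.0, 0.0))
    if n == 0:
        raise ValueError("series is empty")
    mean = total / n
    return mean, total_sq / n - mean * mean


# Demonstration on synthetic forecast errors, sliced two time steps at a time.
errors = [1.0, 2.0, 3.0, 4.0, 5.0]
print(mean_and_variance(errors, size=2))
```

Because `map_slice` touches only its own slice and `merge` is associative, the slices could be summarized on different nodes in any order, which is the property the abstract relies on when it calls the algorithms time independent. The sum-of-squares formula is used here for brevity; it can lose precision on long series with large values.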