A Method of Stellar Spectral Classification Based on Map/Reduce Distributed Computing

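Cite this article:	PAN Jing-chang, WANG Jie, JIANG Bin, LUO A-li, WEI Peng, ZHENG Qiang. A Method of Stellar Spectral Classification Based on Map/Reduce Distributed Computing[J]. Spectroscopy and Spectral Analysis, 2016, 36(8): 2651-2654.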

Authors:	PAN Jing-chang; WANG Jie; JIANG Bin; LUO A-li; WEI Peng; ZHENG Qiang

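Affiliations:	1. School of Mechanical, Electrical & Information Engineering, Shandong University (Weihai), Weihai, Shandong 264209, China; 2. Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012, China; 3. College of Computer and Control Engineering, Yantai University, Yantai, Shandong 264005, China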

Funding:	National Natural Science Foundation of China (U1431102

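Abstract:	Celestial spectra contain a wealth of astrophysical information. By analyzing spectra, one can derive the physical properties, chemical composition, and atmospheric parameters of celestial objects. Large-scale surveys such as LAMOST and SDSS will produce massive volumes of spectral data; in particular, once LAMOST is in formal operation, each observing night yields roughly 20,000 to 40,000 spectra. Such volumes place higher demands on fast and effective spectral processing. Automatic classification of stellar spectra is a basic task in spectral processing, and this work studies automatic classification techniques for massive stellar spectra. Lick line indices are a set of standard indices defined on celestial spectra to describe the strength of spectral lines. They represent physical characteristics of the spectrum, each index is named after its most prominent absorption line, and each covers a relatively wide spectral feature. A Bayesian spectral classification method based on Lick line indices is studied and applied to stars of types F, G, and K. First, the Lick line indices of each spectrum are computed as a feature vector; then a Bayesian classification algorithm assigns the spectra to the three classes. For massive spectral data, both the computation of Lick line indices and the classification of spectra by Bayesian decision are implemented on the Hadoop platform. Drawing on the high throughput and high fault tolerance of Hadoop HDFS, together with the parallelism of the Hadoop MapReduce programming model, the approach improves the efficiency of analyzing and processing large-scale spectral data. The contributions of this work are: (1) stellar spectral classification with a Bayesian algorithm using Lick line indices as features; (2) parallel computation of Lick line indices and parallelization of the Bayesian classification process on the Hadoop MapReduce distributed computing framework.
Keywords:	Lick line index; stellar spectral classification; Hadoop
Received:	2015-03-02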
A Method of Stellar Spectral Classification Based on Map/Reduce Distributed Computing

PAN Jing-chang, WANG Jie, JIANG Bin, LUO A-li, WEI Peng, ZHENG Qiang. A Method of Stellar Spectral Classification Based on Map/Reduce Distributed Computing[J]. Spectroscopy and Spectral Analysis, 2016, 36(8): 2651-2654.

Authors:	PAN Jing-chang WANG Jie JIANG Bin LUO A-li WEI Peng ZHENG Qiang

Affiliation:	1. School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai, Weihai 264209, China; 2. Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012, China; 3. College of Computer and Control Engineering, Yantai University, Yantai 264005, China

Abstract:	Celestial spectra contain a great deal of astrophysical information. Through the analysis of spectra, one can obtain the physical properties of celestial bodies, as well as their chemical composition and atmospheric parameters. With the implementation of LAMOST, SDSS, and other large-scale surveys, massive spectral data will be produced; in particular, with LAMOST in formal operation, about 20,000 to 40,000 spectra are generated each observing night. More efficient processing technology is required to cope with such massive spectra. Automatic classification of stellar spectra is a basic task of spectral processing, and the main purpose of this paper is to study the automatic classification of massive stellar spectra. The Lick indices are a set of standard indices defined on astronomical spectra to describe the strength of spectral lines, and they represent the physical characteristics of spectra. Each Lick index covers a relatively wide spectral feature and is named after its most prominent absorption line. In this paper, a Bayesian method based on Lick line indices is used to classify stellar spectra into three types: F, G, and K. First, the Lick line indices of each spectrum are calculated as its feature vector; then the Bayesian method is used to classify the spectra. For massive spectra, the computation of Lick indices and the spectral classification using the Bayesian decision method are implemented on Hadoop. By using the high throughput and good fault tolerance of HDFS, combined with the advantages of the MapReduce parallel programming model, the efficiency of analysis and processing for massive spectral data is improved. The main contributions of this paper are as follows: (1) using Lick indices as features to classify stellar spectra with the Bayesian decision method; (2) implementing the parallel computation of Lick indices and the parallel Bayesian classification of stellar spectra on the Hadoop MapReduce distributed computing framework.

Keywords:	Lick line index; Stellar spectral classification; Hadoop
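
The abstract describes a two-stage pipeline: compute line indices as a feature vector, then classify with a Bayesian decision rule, with both stages parallelized under MapReduce. The sketch below illustrates that flow in plain Python rather than on Hadoop. It rests on several assumptions that are not taken from the paper: the index is computed as an equivalent width against a linear pseudo-continuum drawn between two sidebands, the class-conditional densities are Gaussian (naive Bayes), the band limits in `BANDS` are hypothetical and not a real Lick definition, and `toy_spectrum` produces synthetic data only. The map/shuffle/reduce steps are simulated in memory.

```python
import math
from collections import defaultdict

# Hypothetical band limits (blue sideband, central band, red sideband).
# These are placeholders for illustration, not real Lick index definitions.
BANDS = {"toy_line": ((4000.0, 4010.0), (4020.0, 4030.0), (4040.0, 4050.0))}

def band_mean(wave, flux, lo, hi):
    """Mean flux of samples with lo <= wavelength <= hi (band must be non-empty)."""
    vals = [f for w, f in zip(wave, flux) if lo <= w <= hi]
    return sum(vals) / len(vals)

def line_index(wave, flux, blue, center, red):
    """Equivalent-width style index: integral of (1 - F/F_cont) over the central band."""
    wb, fb = (blue[0] + blue[1]) / 2, band_mean(wave, flux, *blue)
    wr, fr = (red[0] + red[1]) / 2, band_mean(wave, flux, *red)
    cont = lambda w: fb + (fr - fb) * (w - wb) / (wr - wb)  # linear pseudo-continuum
    pts = [(w, f) for w, f in zip(wave, flux) if center[0] <= w <= center[1]]
    total = 0.0
    for (w0, f0), (w1, f1) in zip(pts, pts[1:]):
        # Trapezoid rule on the normalized absorption depth.
        total += 0.5 * ((1 - f0 / cont(w0)) + (1 - f1 / cont(w1))) * (w1 - w0)
    return total

def map_features(record):
    """Map step: one labeled spectrum -> (label, feature vector of line indices)."""
    label, wave, flux = record
    yield label, [line_index(wave, flux, *BANDS[name]) for name in sorted(BANDS)]

def reduce_stats(label, vectors):
    """Reduce step: per-class count, mean, and variance of each feature."""
    n, dims = len(vectors), len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dims)]
    # A small constant keeps the variance strictly positive.
    varis = [sum((v[d] - means[d]) ** 2 for v in vectors) / n + 1e-6 for d in range(dims)]
    return label, (n, means, varis)

def run_training(records):
    """In-memory stand-in for a MapReduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for rec in records:
        for key, value in map_features(rec):
            groups[key].append(value)
    return dict(reduce_stats(k, vs) for k, vs in groups.items())

def classify(model, feats):
    """Bayes decision: pick the class with the largest log posterior."""
    total = sum(n for n, _, _ in model.values())
    def log_posterior(stats):
        n, means, varis = stats
        lp = math.log(n / total)  # class prior estimated from training counts
        for x, m, v in zip(feats, means, varis):
            lp += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return lp
    return max(model, key=lambda k: log_posterior(model[k]))

def toy_spectrum(depth):
    """Synthetic flat spectrum with one absorption feature of the given depth."""
    wave = [4000.0 + i for i in range(51)]
    flux = [1.0 - depth if 4020.0 <= w <= 4030.0 else 1.0 for w in wave]
    return wave, flux

# Synthetic training set: deeper absorption for later types (an illustrative choice).
TRAIN = [(lab, *toy_spectrum(d)) for lab, d in
         [("F", 0.10), ("F", 0.12), ("G", 0.30), ("G", 0.32), ("K", 0.50), ("K", 0.52)]]

if __name__ == "__main__":
    model = run_training(TRAIN)
    wave, flux = toy_spectrum(0.31)
    feats = [line_index(wave, flux, *BANDS[name]) for name in sorted(BANDS)]
    print(classify(model, feats))
```

Because each spectrum is mapped independently and the reducer only aggregates per-class statistics, the same structure could in principle be distributed across a cluster; how the paper's Hadoop jobs are actually organized is not stated in the abstract.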
This article is indexed in CNKI, Wanfang Data, and other databases.