
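Chinese Microblog Sentiment Analysis Fusing Explicit and Implicit Features
Cite this article: CHEN Tieming, MIAO Ruyi, WANG Xiaohao. Chinese Microblog Sentiment Analysis Fusing Explicit and Implicit Features[J]. Journal of Chinese Information Processing, 2016, 30(4): 184-192.
Authors: CHEN Tieming  MIAO Ruyi  WANG Xiaohao
Affiliation: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, Zhejiang 310023
Funding: Key Program of the National Natural Science Foundation of China, NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization (U1509214)
Abstract: Microblog sentiment analysis is a key technique for studying public opinion on social networks. Emoticons and sentiment words in microblogs are intuitive, explicit sentiment features, whereas the semantics of the microblog content can be regarded as an implicit feature that is often decisive for sentiment classification. This paper therefore proposes a microblog sentiment analysis method that fuses the two kinds of features. First, a sentiment analysis dictionary, an Internet slang dictionary, and an emoticon library are built, and the frequent feature word sets of microblogs are defined; initial sentiment clusters are then obtained from these word sets using maximal frequent itemsets. Because texts may overlap between the initial clusters, an inter-cluster overlap reduction algorithm based on the extended semantic membership degree of short texts is proposed, yielding fully separated initial clusters. Finally, an agglomerative sentiment clustering method based on the cluster semantic similarity matrix is given. Sentiment classification experiments on the training corpus provided by the NLPCC2013 evaluation demonstrate the performance advantage of the method. Taking microblog data on the Malaysia Airlines incident of March 8, 2014 as an example, the paper also uses microblog sentiment analysis to track how public sentiment changed as the event unfolded, illustrating the practical effectiveness of the method.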

Keywords: emoticons  sentiment dictionary  semantics  frequent itemsets  clustering
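
The first two steps of the abstract (initial clusters from maximal frequent word sets, followed by overlap reduction) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the brute-force itemset enumeration, and the toy posts are all assumptions, and the `membership` score is a simple vocabulary-overlap stand-in for the paper's extended semantic membership degree, which the abstract does not specify.

```python
from collections import defaultdict
from itertools import combinations


def frequent_itemsets(posts, min_support, max_size=3):
    """Brute-force: word sets (up to max_size words) found in >= min_support posts."""
    counts = defaultdict(int)
    for words in posts:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(words), k):
                counts[frozenset(combo)] += 1
    return {s for s, c in counts.items() if c >= min_support}


def maximal(itemsets):
    """Keep only itemsets that are not a proper subset of another frequent itemset."""
    return [s for s in itemsets if not any(s < t for t in itemsets)]


def initial_clusters(posts, min_support):
    """One (possibly overlapping) cluster of post indices per maximal frequent itemset."""
    return {
        s: {i for i, words in enumerate(posts) if s <= words}
        for s in maximal(frequent_itemsets(posts, min_support))
    }


def membership(i, members, posts):
    """Assumed stand-in score: share of post i's words used by the cluster's other posts."""
    others = [posts[j] for j in members if j != i]
    if not others:
        return 0.0
    vocab = set().union(*others)
    return len(posts[i] & vocab) / len(posts[i])


def remove_overlap(clusters, posts):
    """Assign each post to the single cluster where its membership score is highest."""
    owner = {}
    for key in sorted(clusters, key=sorted):  # fixed order so ties resolve deterministically
        for i in clusters[key]:
            score = membership(i, clusters[key], posts)
            if i not in owner or score > owner[i][0]:
                owner[i] = (score, key)
    separated = defaultdict(set)
    for i, (_, key) in owner.items():
        separated[key].add(i)
    return dict(separated)


# Toy posts, already tokenized into word/emoticon sets (hypothetical data).
posts = [
    {"happy", ":)", "win"},
    {"happy", ":)", "great"},
    {"sad", "lost"},
    {"sad", "lost", "plane"},
    {"happy", ":)", "sad", "lost", "plane"},
]
overlapping = initial_clusters(posts, min_support=3)
separated = remove_overlap(overlapping, posts)
```

In this toy run the last post initially belongs to both clusters and is kept only in the one whose other posts share more of its vocabulary. A real system would replace the brute-force enumeration with a proper frequent-itemset miner and the overlap score with a semantic measure.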

Chinese Micro-blog Sentiment Analysis using Both Explicit and Implicit Text Features
CHEN Tieming, MIAO Ruyi, WANG Xiaohao. Chinese Micro-blog Sentiment Analysis using Both Explicit and Implicit Text Features[J]. Journal of Chinese Information Processing, 2016, 30(4): 184-192.
Authors:CHEN Tieming  MIAO Ruyi  WANG Xiaohao
Affiliation: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, Zhejiang 310023, China
Abstract: Micro-blog sentiment analysis is a key technique in public opinion research on social networks. Micro-blog emoticons and sentiment words are intuitive and are called explicit emotion features, while the content semantics are called implicit features, which are often decisive for micro-blog emotion discrimination. This paper therefore proposes a new systematic methodology for sentiment analysis that uses both explicit and implicit emotion features. First, a sentiment analysis dictionary, a glossary of social networking terms, and an emoticon library are built. Then the frequent word sets of micro-blog texts are defined. From these feature word sets, the initial micro-blog clusters are generated using the maximal frequent item sets. To resolve micro-blog overlap between multiple initial clusters, an elimination method is proposed that employs the extended semantic membership degree of short messages. Finally, a semantic similarity matrix over the separated clusters is defined, based on which an agglomerative sentiment clustering of micro-blogs is conducted. Using the training corpus of the NLPCC2013 evaluation, comparative experiments demonstrate the performance of the proposed method. A real-world case study also shows how the emotion expressed in Chinese micro-blogs changed during the Malaysia Airlines disappearance incident from March 8 to April 8, 2014.
Keywords:emoticons  emotion dictionary  semantic  frequent item set  clustering  
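
The final agglomerative step can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: the abstract does not define the cluster semantic similarity, so average pairwise Jaccard similarity between word sets is swapped in, and the names `jaccard`, `cluster_similarity`, `agglomerate`, and the `target` stopping parameter are hypothetical.

```python
def jaccard(a, b):
    """Word-set overlap of two posts; an assumed stand-in for semantic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_similarity(c1, c2, posts):
    """Average pairwise similarity between the posts of two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(jaccard(posts[i], posts[j]) for i, j in pairs) / len(pairs)


def agglomerate(clusters, posts, target):
    """Repeatedly merge the most similar pair of clusters until `target` remain."""
    clusters = [set(c) for c in clusters]
    while len(clusters) > max(target, 1):
        best = None
        # Scan the upper triangle of the cluster similarity matrix.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = cluster_similarity(clusters[a], clusters[b], posts)
                if best is None or sim > best[0]:
                    best = (sim, a, b)
        _, a, b = best
        clusters[a] |= clusters[b]
        del clusters[b]
    return clusters


# Toy posts as word sets, each starting in its own cluster (hypothetical data).
posts = [
    {"happy", ":)"},
    {"happy", "great"},
    {"sad", "lost"},
    {"sad", "cry"},
]
merged = agglomerate([{0}, {1}, {2}, {3}], posts, target=2)
```

Here the two posts sharing "happy" end up together, as do the two sharing "sad". Stopping at a fixed number of clusters is one possible design choice; a similarity threshold would be another, and the abstract does not say which the authors use.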