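Microblog Sentiment Classification Based on Sentiment Word Embeddings
Cite this article: DU Hui, XU Xueke, WU Dayong, LIU Yue, YU Zhihua, CHENG Xueqi. Microblog Sentiment Classification Based on Sentiment Word Embeddings[J]. Journal of Chinese Information Processing, 2017, 31(3): 170-176.
Authors: DU Hui  XU Xueke  WU Dayong  LIU Yue  YU Zhihua  CHENG Xueqi
Affiliations: 1. Institute of Computing Technology, Chinese Academy of Sciences; CAS Key Laboratory of Network Data Science and Technology, Beijing 100190;
2. University of Chinese Academy of Sciences, Beijing 100190
Funding: National 973 Program (2014CB340406, 2013CB329602); National 863 Program (2014AA015204); National Natural Science Foundation of China (61232010)
Abstract: This paper proposes a sentiment classification method based on sentiment word embeddings. A word embedding represents a word as a fixed-dimensional vector over the continuous real numbers and can express rich semantic information about the word. Embedding learning methods such as word2vec use contextual information to mine latent semantic associations between words from large-scale corpora. Starting from word embeddings learned from a corpus, which already carry semantic information, we apply a sentiment adjustment to obtain embeddings that account for both semantics and sentiment orientation. For an input text, a feature representation is built from the sentiment word embeddings, and a machine learning method is used to classify the sentiment of the text. Compared with methods that build text representations from words, N-grams, or the original word2vec embeddings, this method achieves higher sentiment classification accuracy and better performance and stability.

Keywords: sentiment analysis  sentiment classification  word embedding  machine learning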

A Sentiment Classification Method Based on Sentiment-Specific Word Embedding
DU Hui,XU Xueke,WU Dayong,LIU Yue,YU Zhihua,CHENG Xueqi.A Sentiment Classification Method Based on Sentiment-Specific Word Embedding[J].Journal of Chinese Information Processing,2017,31(3):170-176.
Authors:DU Hui  XU Xueke  WU Dayong  LIU Yue  YU Zhihua  CHENG Xueqi
Affiliation:1. CAS Key Laboratory of Network Data Science and Technology, Institute of Computing Technology,
Chinese Academy of Sciences, Beijing 100190, China;
2. University of Chinese Academy of Sciences, Beijing 100190, China
Abstract:We present a sentiment classification method based on sentiment-specific word embedding (SSWE). A word embedding is a distributed, fixed-length, real-valued vector representation of a word. Algorithms for learning word embeddings, such as word2vec, obtain this representation from a large unannotated corpus without considering sentiment information. We apply a sentiment adjustment to the initial word embeddings to obtain sentiment-specific word embeddings that capture both semantic and sentiment information. Text representations are then built from the sentiment-specific word embeddings, and the sentiment polarity of each text is predicted with machine learning approaches. Experiments show that the presented method outperforms sentiment classification methods that model texts with words, N-grams, or word embeddings from word2vec.
Keywords:sentiment analysis  sentiment classification  word embedding  machine learning  
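
The pipeline outlined in the abstract (word embeddings, then a text representation, then a learned classifier) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three-dimensional `TOY_EMBEDDINGS` stand in for embeddings that have already been sentiment-adjusted, the text representation is a plain average of word vectors, and a nearest-centroid rule stands in for the unspecified machine learning method.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]

# Hypothetical embeddings (assumption). Real ones would come from word2vec
# followed by the sentiment adjustment, which the abstract does not detail.
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "good": [0.9, 0.1, 0.3],
    "great": [0.8, 0.2, 0.4],
    "bad": [-0.7, 0.1, 0.3],
    "awful": [-0.9, 0.2, 0.2],
    "movie": [0.0, 0.9, 0.1],
}
DIM = 3


def text_vector(tokens: Sequence[str], embeddings: Dict[str, Vector]) -> Vector:
    """Average the embeddings of known tokens (assumed text representation)."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * DIM  # no known word: fall back to the zero vector
    return [sum(v[i] for v in known) / len(known) for i in range(DIM)]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_centroids(
    samples: Sequence[Tuple[Sequence[str], str]],
    embeddings: Dict[str, Vector],
) -> Dict[str, Vector]:
    """Compute one mean text vector per label (nearest-centroid stand-in)."""
    by_label: Dict[str, List[Vector]] = {}
    for tokens, label in samples:
        by_label.setdefault(label, []).append(text_vector(tokens, embeddings))
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in by_label.items()
    }


def classify(
    tokens: Sequence[str],
    centroids: Dict[str, Vector],
    embeddings: Dict[str, Vector],
) -> str:
    """Return the label whose centroid is closest to the text vector."""
    vec = text_vector(tokens, embeddings)
    return min(centroids, key=lambda label: squared_distance(vec, centroids[label]))


# Tiny labeled set, invented for this example.
TRAIN = [
    (["good", "movie"], "positive"),
    (["great", "movie"], "positive"),
    (["bad", "movie"], "negative"),
    (["awful", "movie"], "negative"),
]
CENTROIDS = train_centroids(TRAIN, TOY_EMBEDDINGS)
```

Calling `classify(["great"], CENTROIDS, TOY_EMBEDDINGS)` assigns the text to the nearer class centroid. In practice the averaged vectors could be passed to any standard classifier; the nearest-centroid rule is used here only to keep the sketch free of dependencies.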