
Improved CBOW Emotional Information Acquisition Research
Original title: 改进的CBOW情感信息获取研究
Cite as: CAO Junbo, YE Xia, XU Feixiang, YIN Liedong. Improved CBOW Emotional Information Acquisition Research[J]. Computer Engineering and Applications, 2020, 56(9): 142-147.
Authors: CAO Junbo  YE Xia  XU Feixiang  YIN Liedong
Affiliation: Academy of Combat Support, Rocket Force University of Engineering, Xi'an 710025, China
Abstract: In the era of big data, the sentiment orientation of a text is of great significance for mining its potential value. Manual methods, however, struggle to mine the potential value of online review text effectively; with the rapid development of computer technology, this problem can now be addressed effectively. In text sentiment analysis, acquiring the sentiment information of words is crucial. Word vector methods generally model only the syntax and semantics of words and ignore their sentiment information, which limits their usefulness for sentiment analysis. In the proposed method, a TF-IDF model produces a weighting matrix and a stop word list is constructed. A Huffman tree generated from the weighting matrix serves as the input to the improved CBOW algorithm, and a sentiment dictionary is introduced to generate sentiment labels that assist word vector generation, so that the resulting word vectors carry sentiment information. Experimental results show that the word vectors obtained from review text by the proposed method express sentiment information well, and that its sentiment classification results are better than those of the traditional model. The model can therefore effectively improve sentiment classification in the sentiment analysis of review text.

Keywords: word vector  Continuous Bag-of-Words (CBOW) model  Term Frequency-Inverse Document Frequency (TF-IDF) model  sentiment analysis
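
The abstract describes building a Huffman tree from TF-IDF weights rather than from raw word counts. The following is a minimal Python sketch of that one step, not the authors' implementation. The function names `tfidf_weights` and `huffman_codes` are hypothetical, and two details are assumptions that the abstract does not specify: each word's weight is taken as its TF-IDF score summed over all documents, and the IDF term uses a common smoothed form. The sentiment-label part of the method is not shown.

```python
import heapq
import itertools
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one weight per word: its TF-IDF score summed over all documents.

    Assumption: the abstract does not say how the weighting matrix is reduced
    to a single weight per word; summing over documents is one simple choice.
    """
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    weights = Counter()
    for doc in docs:
        for word, count in Counter(doc).items():
            tf = count / len(doc)
            # Smoothed IDF (an assumption); keeps every weight positive.
            idf = math.log((1 + n_docs) / (1 + doc_freq[word])) + 1
            weights[word] += tf * idf
    return dict(weights)


def huffman_codes(weights):
    """Build a Huffman tree over the weights and return a binary code per word."""
    tie = itertools.count()  # tie-breaker so the heap never compares payloads
    heap = [(w, next(tie), word) for word, w in weights.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        # Merge the two lightest nodes; internal nodes are (left, right) tuples.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))
    codes = {}
    stack = [(heap[0][2], "")]
    while stack:
        node, prefix = stack.pop()
        if isinstance(node, tuple):
            stack.append((node[0], prefix + "0"))
            stack.append((node[1], prefix + "1"))
        else:
            codes[node] = prefix
    return codes


if __name__ == "__main__":
    # Toy tokenized reviews, purely illustrative.
    reviews = [
        ["good", "movie", "good"],
        ["bad", "movie"],
        ["great", "plot", "good", "acting"],
    ]
    weights = tfidf_weights(reviews)
    for word, code in sorted(huffman_codes(weights).items()):
        print(f"{word}\t{weights[word]:.3f}\t{code}")
```

In the standard word2vec CBOW model with hierarchical softmax, the Huffman tree is built from raw word counts, so frequent words get short paths. Substituting TF-IDF weights, as sketched here, instead gives short paths to words with high TF-IDF weight.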
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).