首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
Social media sites and applications, including Facebook, YouTube, Twitter and blogs, have become major social media attractions today. The huge amount of information from this medium has become an attractive resource for organisations to monitor the opinions of users, and therefore, it is receiving a lot of attention in the field of sentiment analysis. Early work on sentiment analysis approached this problem at a document-level, where the overall sentiment was identified, rather than the details of the sentiment. This research took into account the use of an aspect-based sentiment analysis on Twitter in order to perform a finer-grained analysis. A new hybrid sentiment classification for Twitter is proposed by embedding a feature selection method. A comparison of the accuracy of the classification by the principal component analysis (PCA), latent semantic analysis (LSA), and random projection (RP) feature selection methods are presented in this paper. Furthermore, the hybrid sentiment classification was validated using Twitter datasets to represent different domains, and the evaluation with different classification algorithms also demonstrated that the new hybrid approach produced meaningful results. The implementations showed that the new hybrid sentiment classification was able to improve the accuracy performance from the existing baseline sentiment classification methods by 76.55, 71.62 and 74.24%, respectively.  相似文献   

2.
The emergence of Web 2.0 has drastically altered the way users perceive the Internet, by improving information sharing, collaboration and interoperability. Micro-blogging is one of the most popular Web 2.0 applications and related services, like Twitter, have evolved into a practical means for sharing opinions on almost all aspects of everyday life. Consequently, micro-blogging web sites have since become rich data sources for opinion mining and sentiment analysis. Towards this direction, text-based sentiment classifiers often prove inefficient, since tweets typically do not consist of representative and syntactically consistent words, due to the imposed character limit. This paper proposes the deployment of original ontology-based techniques towards a more efficient sentiment analysis of Twitter posts. The novelty of the proposed approach is that posts are not simply characterized by a sentiment score, as is the case with machine learning-based classifiers, but instead receive a sentiment grade for each distinct notion in the post. Overall, our proposed architecture results in a more detailed analysis of post opinions regarding a specific topic.  相似文献   

3.
以文本颗粒度为视角,从情感词抽取、语料库和情感词典构建、评价对象与意见持有者分析、篇章级情感分析、实际应用五个方面对文本情感分析文献进行了梳理,并做出必要评述。指出当前情感分析系统的准确率普遍不高,进一步研究的重点在于:自然语言处理的研究成果在文本情感倾向分析中更广泛和贴切的应用;选取文本情感倾向分类的特征和方法;利用现有语言工具和相关资源,规范、快速地构造语言工具和相关资源并应用。  相似文献   

4.
Naz  Huma  Ahuja  Sachin  Kumar  Deepak  Rishu 《Multimedia Tools and Applications》2021,80(8):11443-11458
Multimedia Tools and Applications - Sentiment analysis refers to the interpretation and computational study of emotions, opinions and appraisals within the text data using text analysis methods. A...  相似文献   

5.
中文文本情感分析研究综述   总被引:3,自引:0,他引:3  
对中文文本情感分析的研究进行了综述。将情感分类划分为信息抽取和情感识别两类任务,并分别介绍了各自的研究进展;总结了情感分析的应用现状,最后提出了存在的问题及不足。  相似文献   

6.
文本情感分析领域内的特征加权一般考虑两个影响因子:特征在文档中的重要性(ITD)和特征在表达情感上的重要性(ITS)。结合该领域内两种分类准确率较高的监督特征加权算法,提出了一种新的ITS算法。新算法同时考虑特征在一类文档集里的文档频率(在特定的文档集里,出现某个特征的文档数量)及其占总文档频率的比例,使主要出现且大量出现在同一类文档集里的特征获得更高的ITS权值。实验证明,新算法能提高文本情感分类的准确率。  相似文献   

7.
中文文本情感分析综述   总被引:5,自引:0,他引:5  
魏韡  向阳  陈千 《计算机应用》2011,31(12):3321-3323
由于主观性文本有很多应用价值,情感分析近年来引起了很多研究人员的兴趣.情感分析是对主观性文本进行挖掘与分析,获取有用的知识和信息.针对中文文本情感分析的研究现状与进展进行总结.首先按粒度层次,从词语级、语句级、篇章级三个不同粒度层次细致地介绍相关的技术,再按文本的类型,分析了产品评论和新闻评论的研究进展.接着介绍了中文...  相似文献   

8.
Swathi  T.  Kasiviswanath  N.  Rao  A. Ananda 《Applied Intelligence》2022,52(12):13675-13688
Applied Intelligence - Stock Price Prediction is one of the hot research topics in financial engineering, influenced by economic, social, and political factors. In the present stock market, the...  相似文献   

9.
Multimedia Tools and Applications - The National Examination (UN) is a system of evaluation of education standards for elementary and secondary schools conducted nationally and is also used to...  相似文献   

10.
The Chinese pronunciation system offers two characteristics that distinguish it from other languages: deep phonemic orthography and intonation variations. In this paper, we hypothesize that these two important properties can play a major role in Chinese sentiment analysis. In particular, we propose two effective features to encode phonetic information and, hence, fuse it with textual information. With this hypothesis, we propose Disambiguate Intonation for Sentiment Analysis (DISA), a network that we develop based on the principles of reinforcement learning. DISA disambiguates intonations for each Chinese character (pinyin) and, hence, learns precise phonetic representations. We also fuse phonetic features with textual and visual features to further improve performance. Experimental results on five different Chinese sentiment analysis datasets show that the inclusion of phonetic features significantly and consistently improves the performance of textual and visual representations and surpasses the state-of-the-art Chinese character-level representations.  相似文献   

11.
Sentiment analysis in text mining is a challenging task. Sentiment is subtly reflected by the tone and affective content of a writer’s words. Conventional text mining techniques, which are based on keyword frequencies, usually run short of accurately detecting such subjective information implied in the text. In this paper, we evaluate several popular classification algorithms, along with three filtering schemes. The filtering schemes progressively shrink the original dataset with respect to the contextual polarity and frequent terms of a document. We call this approach “hierarchical classification”. The effects of the approach in different combination of classification algorithms and filtering schemes are discussed over three sets of controversial online news articles where binary and multi-class classifications are applied. Meanwhile we use two methods to test this hierarchical classification model, and also have a comparison of the two methods.  相似文献   

12.
王昆  郑毅  方书雅  刘守印 《计算机应用》2020,40(10):2838-2844
方面级情感分析旨在分类出文本在不同方面的情感倾向。在长文本的方面级情感分析中,由于长文本存在的冗余和噪声问题,导致现有的方面级情感分析算法对于长文本中方面相关信息的特征提取不够充分,分类不精准;而在方面分层为粗粒度和细粒度方面的数据集上,现有的解决方案没有利用粗粒度方面中的信息。针对以上问题,提出基于文本筛选和改进BERT的算法TFN+BERT-Pair-ATT。该算法首先利用长短时记忆网络(LSTM)和注意力机制相结合的文本筛选网络(TFN)从长文本中直接筛选出与粗粒度方面相关的部分语句;然后将部分语句按次序进行组合,并与细粒度方面相结合输入至在BERT上增加注意力层的BERT-Pair-ATT中进行特征提取;最后使用Softmax进行情感分类。通过与基于卷积神经网络(CNN)的GCAE(Gated Convolutional Network with Aspect Embedding)、基于LSTM的交互式注意力模型(IAN)等经典模型相比,该算法在验证集上的相关评价指标分别提高了3.66%和4.59%,与原始BERT模型相比提高了0.58%。实验结果表明,基于文本筛选和改进BERT的算法在长文本方面级情感分析任务中具有较大的价值。  相似文献   

13.
王昆  郑毅  方书雅  刘守印 《计算机应用》2005,40(10):2838-2844
方面级情感分析旨在分类出文本在不同方面的情感倾向。在长文本的方面级情感分析中,由于长文本存在的冗余和噪声问题,导致现有的方面级情感分析算法对于长文本中方面相关信息的特征提取不够充分,分类不精准;而在方面分层为粗粒度和细粒度方面的数据集上,现有的解决方案没有利用粗粒度方面中的信息。针对以上问题,提出基于文本筛选和改进BERT的算法TFN+BERT-Pair-ATT。该算法首先利用长短时记忆网络(LSTM)和注意力机制相结合的文本筛选网络(TFN)从长文本中直接筛选出与粗粒度方面相关的部分语句;然后将部分语句按次序进行组合,并与细粒度方面相结合输入至在BERT上增加注意力层的BERT-Pair-ATT中进行特征提取;最后使用Softmax进行情感分类。通过与基于卷积神经网络(CNN)的GCAE(Gated Convolutional Network with Aspect Embedding)、基于LSTM的交互式注意力模型(IAN)等经典模型相比,该算法在验证集上的相关评价指标分别提高了3.66%和4.59%,与原始BERT模型相比提高了0.58%。实验结果表明,基于文本筛选和改进BERT的算法在长文本方面级情感分析任务中具有较大的价值。  相似文献   

14.
目前的情感分析研究大部分仅局限于能够明显地表达意见的主观性文本,却没有对一些隐含地表达情感的文本进行分析.针对这一不足,提出一种基于条件随机场(CRFs)模型的意见挖掘中维吾尔语文本隐式情感分析方法.利用互信息(MI)衡量上下文的依赖度,结合词法、语境依赖词、标点符号和习语等特征用于隐式情感分析.在特征选择时,通过对信息增益(IG)进行改进,解决语料中数据集不平衡的问题.该方法用于维吾尔语文本隐式情感分析的准确率为77.11%,召回率为78.37%,表明了其在意见挖掘中隐式情感分析任务上的有效性.  相似文献   

15.
马远 《计算机应用研究》2021,38(6):1753-1758
方面级别的文本情感分析旨在针对一个句子中具体的方面单词来判断其情感极性.针对方面单词可能由多个单词组成、平均化所有单词的词向量容易导致语义错误或混乱,不同的文本单词对于方面单词的情感极性判断具有不同的影响力的问题,提出一种融合左右的双边注意力机制的方面级别的文本情感分析模型.首先,设计内部注意力机制来处理方面单词,并根据方面单词和上下文单词设计了双边交互注意力机制,最后将双边交互注意力的处理结果与方面单词处理值三个部分级联起来进行分类.模型在SemEval 2014中两个数据集上进行了实验,分别实现了81.33%和74.22%的准确率,相比较于机器学习和结合注意力机制的各种模型取得了更好的效果.  相似文献   

16.
李铮  陈莉  张爽 《计算机应用研究》2021,38(8):2303-2307
目前情感分析模型通常使用word2vec、GloVe等方法生成静态词向量,并且传统的卷积或循环深度模型无法完整地关注上下文,提取特征不充分,影响情感判断.针对上述问题,提出基于ELMo(embedding from lan-guage model)和双向自注意力网络(bidirectional self-attention network,Bi-SAN)的中文文本情感分析模型.首先通过ELMo语言模型训练得到融合词语本身和上下文信息的词向量,解决了一词多义的问题;同时使用预训练的skip-gram算法代替随机初始化的ELMo模型的嵌入层,提高模型的收敛速度;之后使用Bi-SAN提取特征,由于自注意力机制,Bi-SAN可以完整地关注每个词的上下文,提取特征更为全面.同现有的多个情感分析模型对比,该模型在酒店评论数据集上和NLPCC2014 task2中文数据集取得了更高的F1值,验证了模型的有效性.  相似文献   

17.
随着信息化技术的不断提升,各类社交平台上带有倾向性的图文数据量快速增长,图文融合的情感分析受到广泛关注,单一的情感分析方法不再能够满足多模态数据的需求.针对图文情感特征提取与融合的技术难题,首先,列举了目前应用较广的图文情感分析数据集,介绍了文本特征和图片特征的提取方式;然后,重点研究了当前图文特征融合方式,简述了在图...  相似文献   

18.
在文本情感分析时,使用有监督的机器学习方法的不足是需要大量的带标签的文本数据,而无监督的文本聚类方法可以克服这一问题。对于文本情感聚类,在节省数据资源的同时,也存在聚类结果的不确定性问题。给出了情感维度的形式化描述,并将观点词识别技术应用于情感维度的判别中。在此基础上,利用获得的情感维度,对评论文本进行情感聚类,有效地解决情感聚类结果的不确定性问题。在4个领域的英文产品评论数据上进行实验,结果表明该方法在自动识别情感聚类维度中是有效的,并得到了满意的情感聚类结果。  相似文献   

19.
This paper describes a support vector machine-based approach to different tasks related to sentiment analysis in Twitter for Spanish. We focus on parameter optimization of the models and the combination of several models by means of voting techniques. We evaluate the proposed approach in all the tasks that were defined in the five editions of the TASS workshop, between 2012 and 2016. TASS has become a framework for sentiment analysis tasks that are focused on the Spanish language. We describe our participation in this competition and the results achieved, and then we provide an analysis of and comparison with the best approaches of the teams who participated in all the tasks defined in the TASS workshops. To our knowledge, our results exceed those published to date in the sentiment analysis tasks of the TASS workshops.  相似文献   

20.
Term weighting is a strategy that assigns weights to terms to improve the performance of sentiment analysis and other text mining tasks. In this paper, we propose a supervised term weighting scheme based on two basic factors: Importance of a term in a document (ITD) and importance of a term for expressing sentiment (ITS), to improve the performance of analysis. For ITD, we explore three definitions based on term frequency. Then, seven statistical functions are employed to learn the ITS of each term from training documents with category labels. Compared with the previous unsupervised term weighting schemes originated from information retrieval, our scheme can make full use of the available labeling information to assign appropriate weights to terms. We have experimentally evaluated the proposed method against the state-of-the-art method. The experimental results show that our method outperforms the method and produce the best accuracy on two of three data sets.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号