20 similar documents found
1.
Social media sites and applications, including Facebook, YouTube, Twitter and blogs, have become major attractions today. The huge amount of information from this medium has become an attractive resource for organisations monitoring the opinions of users, and it is therefore receiving a lot of attention in the field of sentiment analysis. Early work on sentiment analysis approached this problem at the document level, identifying the overall sentiment rather than its details. This research applied aspect-based sentiment analysis to Twitter in order to perform a finer-grained analysis. A new hybrid sentiment classification for Twitter is proposed by embedding a feature selection method. A comparison of the classification accuracy obtained with the principal component analysis (PCA), latent semantic analysis (LSA), and random projection (RP) feature selection methods is presented in this paper. Furthermore, the hybrid sentiment classification was validated using Twitter datasets representing different domains, and the evaluation with different classification algorithms also demonstrated that the new hybrid approach produced meaningful results. The implementations showed that the new hybrid sentiment classification was able to improve the accuracy performance from the existing baseline sentiment classification methods by 76.55, 71.62 and 74.24%, respectively.
2.
Efstratios Kontopoulos Christos Berberidis Theologos Dergiades Nick Bassiliades 《Expert systems with applications》2013,40(10):4065-4074
The emergence of Web 2.0 has drastically altered the way users perceive the Internet, by improving information sharing, collaboration and interoperability. Micro-blogging is one of the most popular Web 2.0 applications, and related services, like Twitter, have evolved into a practical means for sharing opinions on almost all aspects of everyday life. Consequently, micro-blogging web sites have since become rich data sources for opinion mining and sentiment analysis. In this setting, text-based sentiment classifiers often prove inefficient, since tweets typically do not consist of representative and syntactically consistent words, due to the imposed character limit. This paper proposes the deployment of original ontology-based techniques towards a more efficient sentiment analysis of Twitter posts. The novelty of the proposed approach is that posts are not simply characterized by a sentiment score, as is the case with machine learning-based classifiers, but instead receive a sentiment grade for each distinct notion in the post. Overall, our proposed architecture results in a more detailed analysis of post opinions regarding a specific topic.
3.
4.
Multimedia Tools and Applications - Sentiment analysis refers to the interpretation and computational study of emotions, opinions and appraisals within the text data using text analysis methods. A...
5.
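A Survey of Research on Chinese Text Sentiment Analysis  Total citations: 3, self-citations: 0, citations by others: 3
A survey of research on Chinese text sentiment analysis is presented. Sentiment classification is divided into two tasks, information extraction and sentiment recognition, and the research progress on each is described. The current state of sentiment analysis applications is summarized, and finally the existing problems and shortcomings are pointed out.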
6.
郑安怡 《计算机工程与应用》2015,51(21):30-35
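Feature weighting in text sentiment analysis generally considers two factors: the importance of a feature in a document (ITD) and the importance of a feature for expressing sentiment (ITS). Combining two supervised feature weighting algorithms with relatively high classification accuracy in this field, a new ITS algorithm is proposed. The new algorithm considers both the document frequency of a feature within one class of documents (the number of documents in a given document set that contain the feature) and the proportion of the total document frequency that this represents, so that features appearing mainly and in large numbers within a single class of documents receive higher ITS weights. Experiments show that the new algorithm improves the accuracy of text sentiment classification.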
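The ITS idea in the abstract above can be sketched as follows. This is only an illustration, not the paper's formula, which the abstract does not state: the function name `its_weights` and the choice to multiply the class document frequency by its share of the total document frequency are assumptions made here.

```python
from collections import Counter


def its_weights(docs, labels):
    """Return a hypothetical ITS weight for each (term, class) pair.

    docs   -- list of token lists, one per document
    labels -- class label of each document
    """
    class_df = Counter()  # (term, label) -> documents of that class containing term
    total_df = Counter()  # term -> documents of any class containing term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):  # count each term once per document
            class_df[(term, label)] += 1
            total_df[term] += 1

    weights = {}
    for (term, label), df in class_df.items():
        share = df / total_df[term]  # proportion of the total document frequency
        # Assumed combination: frequent in the class AND concentrated in it.
        weights[(term, label)] = df * share
    return weights


docs = [["good", "movie"], ["good", "plot"], ["bad", "movie"]]
labels = ["pos", "pos", "neg"]
w = its_weights(docs, labels)
```

In this toy corpus, "good" occurs only in positive documents and so gets a higher positive-class weight than "movie", which is split evenly between the two classes.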
7.
8.
Applied Intelligence - Stock Price Prediction is one of the hot research topics in financial engineering, influenced by economic, social, and political factors. In the present stock market, the...
9.
Sutoyo Edi Rifai Achmad Pratama Risnumawan Anhar Saputra Muhardi 《Multimedia Tools and Applications》2022,81(5):6413-6431
Multimedia Tools and Applications - The National Examination (UN) is a system of evaluation of education standards for elementary and secondary schools conducted nationally and is also used to...
10.
The Chinese pronunciation system offers two characteristics that distinguish it from other languages: deep phonemic orthography and intonation variations. In this paper, we hypothesize that these two important properties can play a major role in Chinese sentiment analysis. In particular, we propose two effective features to encode phonetic information and, hence, fuse it with textual information. With this hypothesis, we propose Disambiguate Intonation for Sentiment Analysis (DISA), a network that we develop based on the principles of reinforcement learning. DISA disambiguates intonations for each Chinese character (pinyin) and, hence, learns precise phonetic representations. We also fuse phonetic features with textual and visual features to further improve performance. Experimental results on five different Chinese sentiment analysis datasets show that the inclusion of phonetic features significantly and consistently improves the performance of textual and visual representations and surpasses the state-of-the-art Chinese character-level representations.
11.
Jinyan Li Simon Fong Yan Zhuang Richard Khoury 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2016,20(9):3411-3420
Sentiment analysis in text mining is a challenging task. Sentiment is subtly reflected by the tone and affective content of a writer’s words. Conventional text mining techniques, which are based on keyword frequencies, usually fall short of accurately detecting such subjective information implied in the text. In this paper, we evaluate several popular classification algorithms, along with three filtering schemes. The filtering schemes progressively shrink the original dataset with respect to the contextual polarity and frequent terms of a document. We call this approach “hierarchical classification”. The effects of the approach under different combinations of classification algorithms and filtering schemes are discussed over three sets of controversial online news articles, to which binary and multi-class classification are applied. We also test this hierarchical classification model with two methods and compare the two.
12.
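Aspect-level sentiment analysis aims to classify the sentiment polarity of a text with respect to different aspects. In aspect-level sentiment analysis of long texts, the redundancy and noise in the text mean that existing aspect-level sentiment analysis algorithms extract the features of aspect-related information insufficiently and classify inaccurately. In addition, on datasets where aspects are organized hierarchically into coarse-grained and fine-grained aspects, existing solutions do not use the information in the coarse-grained aspects. To address these problems, the algorithm TFN+BERT-Pair-ATT, based on text filtering and an improved BERT, is proposed. The algorithm first uses a Text Filter Network (TFN), which combines a Long Short-Term Memory (LSTM) network with an attention mechanism, to select directly from the long text the sentences related to the coarse-grained aspect. The selected sentences are then combined in order and, together with the fine-grained aspect, fed into BERT-Pair-ATT, which adds an attention layer on top of BERT, for feature extraction. Finally, Softmax is used for sentiment classification. Compared with classic models such as GCAE (Gated Convolutional Network with Aspect Embedding), which is based on a Convolutional Neural Network (CNN), and the LSTM-based interactive attention model (IAN), the algorithm improves the relevant evaluation metric on the validation set by 3.66% and 4.59%, respectively, and by 0.58% compared with the original BERT model. The experimental results show that the algorithm based on text filtering and an improved BERT is of considerable value for aspect-level sentiment analysis of long texts.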
14.
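Most current sentiment analysis research is limited to subjective texts that express opinions explicitly and does not analyze texts that express sentiment implicitly. To address this shortcoming, a method based on the conditional random fields (CRFs) model is proposed for implicit sentiment analysis of Uyghur text in opinion mining. Mutual information (MI) is used to measure contextual dependency, and lexical features, context-dependent words, punctuation, idioms and other features are combined for implicit sentiment analysis. During feature selection, information gain (IG) is improved to address the imbalance of the datasets in the corpus. Applied to implicit sentiment analysis of Uyghur text, the method achieves an accuracy of 77.11% and a recall of 78.37%, demonstrating its effectiveness for implicit sentiment analysis in opinion mining.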
15.
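Aspect-level text sentiment analysis aims to determine the sentiment polarity toward a specific aspect term in a sentence. Three problems arise: an aspect term may consist of several words, averaging the word vectors of all the words easily leads to semantic errors or confusion, and different words in the text influence the sentiment polarity judgment for the aspect term to different degrees. To address them, an aspect-level text sentiment analysis model incorporating a left-right bilateral attention mechanism is proposed. First, an internal attention mechanism is designed to process the aspect term, and a bilateral interactive attention mechanism is designed based on the aspect term and the context words. Finally, the outputs of the bilateral interactive attention and the processed aspect-term representation, three parts in total, are concatenated for classification. The model was evaluated on two datasets from SemEval 2014, achieving accuracies of 81.33% and 74.22%, respectively, and outperforming various machine learning models and attention-based models.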
16.
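Current sentiment analysis models usually generate static word vectors with methods such as word2vec and GloVe, and traditional convolutional or recurrent deep models cannot attend to the full context, so feature extraction is insufficient and sentiment judgment suffers. To address these problems, a Chinese text sentiment analysis model based on ELMo (embedding from language model) and a bidirectional self-attention network (Bi-SAN) is proposed. First, the ELMo language model is trained to obtain word vectors that fuse information from the word itself and from its context, which resolves polysemy. At the same time, a pretrained skip-gram algorithm replaces the randomly initialized embedding layer of the ELMo model, which speeds up convergence. Bi-SAN is then used to extract features. Thanks to the self-attention mechanism, Bi-SAN can attend to the full context of each word, so feature extraction is more comprehensive. Compared with several existing sentiment analysis models, the model achieves higher F1 scores on a hotel review dataset and on the NLPCC2014 task2 Chinese dataset, which verifies its effectiveness.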
17.
18.
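In text sentiment analysis, a drawback of supervised machine learning methods is that they require a large amount of labeled text data, whereas unsupervised text clustering methods can overcome this problem. Text sentiment clustering saves data resources, but it also suffers from uncertainty in the clustering results. A formal description of sentiment dimensions is given, and opinion word recognition is applied to the identification of sentiment dimensions. On this basis, the sentiment dimensions obtained are used to cluster review texts by sentiment, which effectively resolves the uncertainty of the clustering results. Experiments on English product review data from four domains show that the method is effective at automatically identifying sentiment clustering dimensions and yields satisfactory sentiment clustering results.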
19.
This paper describes a support vector machine-based approach to different tasks related to sentiment analysis in Twitter for Spanish. We focus on parameter optimization of the models and the combination of several models by means of voting techniques. We evaluate the proposed approach in all the tasks that were defined in the five editions of the TASS workshop, between 2012 and 2016. TASS has become a framework for sentiment analysis tasks that are focused on the Spanish language. We describe our participation in this competition and the results achieved, and then we provide an analysis of and comparison with the best approaches of the teams who participated in all the tasks defined in the TASS workshops. To our knowledge, our results exceed those published to date in the sentiment analysis tasks of the TASS workshops.
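Combining several models by voting, as mentioned in the abstract above, could look like the sketch below. The abstract does not say which voting scheme the authors used, so plain majority voting is assumed here. The function name `majority_vote`, the tie-breaking rule and the label strings are all illustrative.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per item.

    predictions -- list of label lists, one per model, all of equal length.
    Ties are broken in favour of the earliest model in the list
    (an arbitrary choice made for this sketch).
    """
    combined = []
    for labels in zip(*predictions):  # labels given to one item by every model
        counts = Counter(labels)
        top = max(counts.values())
        # First label, in model order, that reaches the highest vote count.
        combined.append(next(label for label in labels if counts[label] == top))
    return combined


# Hypothetical outputs of three classifiers on three tweets.
model_a = ["P", "N", "NEU"]
model_b = ["P", "P", "N"]
model_c = ["N", "P", "NEU"]
combined = majority_vote([model_a, model_b, model_c])
```

Each tweet receives the label chosen by most models. Weighted voting would be a natural variant if per-model validation scores were available.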
20.
《Expert systems with applications》2014,41(7):3506-3513
Term weighting is a strategy that assigns weights to terms to improve the performance of sentiment analysis and other text mining tasks. In this paper, we propose a supervised term weighting scheme based on two basic factors, the importance of a term in a document (ITD) and the importance of a term for expressing sentiment (ITS), to improve the performance of analysis. For ITD, we explore three definitions based on term frequency. Then, seven statistical functions are employed to learn the ITS of each term from training documents with category labels. Compared with previous unsupervised term weighting schemes that originated in information retrieval, our scheme can make full use of the available labeling information to assign appropriate weights to terms. We have experimentally evaluated the proposed method against the state-of-the-art method. The experimental results show that our method outperforms that method and produces the best accuracy on two of three data sets.