Similar literature
1.
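吴璠, 王中卿, 周夏冰, 周国栋. 《软件学报》 (Journal of Software), 2020, 31(8): 2492-2507
Sentiment analysis aims to determine the sentiment polarity of a text, while review quality detection aims to judge the quality of a review. Both are key tasks in sentiment analysis research, and they are closely related through several factors: reviews of the same product tend to share a similar sentiment polarity, and reviews written by the same user tend to be of similar quality. To better study the correlation between sentiment classification and review quality detection, and the respective influence of user information and product information on each task, a joint model for sentiment analysis and review quality detection is proposed. First, deep learning is used to learn the textual information of reviews, which serves as the basis linking the two tasks. Then, a user's reviews and a product's reviews are taken as the representations of the user and the product. On this basis, a user attention mechanism encodes the user representation and a product attention mechanism encodes the product representation. Finally, the user and product representations are combined for sentiment analysis and review quality detection. Experiments on the Yelp2013 and Yelp2015 datasets show that, compared with existing neural network models, the model effectively improves performance on both sentiment analysis and online review quality detection.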

2.
Sentiment analysis involves the detection of sentiment content of text using natural language processing. Natural language processing is a very challenging task due to syntactic ambiguities, named entity recognition, use of slang, jargon, sarcasm, abbreviations and contextual sensitivity. Sentiment analysis can be performed using supervised as well as unsupervised approaches. As the amount of data grows, unsupervised approaches become vital as they cut down on the learning time and the need for a labelled dataset. Sentiment lexicons provide an easy application of unsupervised algorithms for text classification. SentiWordNet is a lexical resource widely employed by many researchers for sentiment analysis and polarity classification. However, the reported performance levels need improvement. The proposed research is focused on raising the performance of SentiWordNet3.0 by using it as a labelled corpus to build another sentiment lexicon, named Senti-CS. The part-of-speech information, usage-based ranks and sentiment scores are used to calculate a Chi-Square-based feature weight for each unique subjective term/part-of-speech pair extracted from SentiWordNet3.0. This weight is then normalized to the range -1 to +1 using min-max normalization. A Senti-CS based sentiment analysis framework is presented and applied to a large dataset of 50000 movie reviews. These results are then compared with baseline SentiWordNet, Mutual Information and Information Gain techniques. A state-of-the-art comparison is performed for the Cornell movie review dataset. The analysis of results indicates that the proposed approach outperforms state-of-the-art classifiers.
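The weighting step described above can be illustrated with a minimal sketch. It assumes a plain 2x2 term/class contingency table and a sign taken from the class the term favors; the abstract does not give the exact Senti-CS formula, so the function names and the signing rule here are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class contingency table.

    a: positive entries containing the term, b: negative entries containing it,
    c: positive entries without the term,   d: negative entries without it.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: the term carries no class information
    return n * (a * d - b * c) ** 2 / denom


def signed_chi_square(a, b, c, d):
    """Assumption: give the weight the sign of the class the term leans toward."""
    sign = 1.0 if a * d >= b * c else -1.0
    return sign * chi_square(a, b, c, d)


def min_max_scale(values, lo=-1.0, hi=1.0):
    """Min-max normalization of a list of weights into [lo, hi]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [(lo + hi) / 2 for _ in values]  # all weights identical
    return [lo + (v - v_min) * (hi - lo) / (v_max - v_min) for v in values]


# Toy counts, invented for illustration only.
raw = [signed_chi_square(10, 0, 0, 10),
       signed_chi_square(5, 5, 5, 5),
       signed_chi_square(0, 10, 10, 0)]
weights = min_max_scale(raw)
```

Signing the statistic before scaling is one way to make the endpoints -1 and +1 correspond to strongly negative and strongly positive terms; an unsigned chi-square value alone only measures how strongly a term depends on the class.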

3.
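In text sentiment analysis, a review carries semantic information at several granularities (document, sentence and word level), and different words and sentences contribute differently to sentiment classification. Methods that model the whole review directly are too coarse, and they also ignore information about the user expressing the sentiment and the product being evaluated. To address this, a hierarchical neural network model based on multiple attention mechanisms is proposed. The model captures semantic information at the word, sentence and document levels, and introduces user- and product-based attention at the sentence and document levels to compute the importance of different sentences and words. Tests on three public datasets show that the multi-attention hierarchical neural network method significantly outperforms other models.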
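As a rough illustration of attention conditioned on user and product, the sketch below pools a list of hidden vectors with weights derived from a user vector and a product vector. It uses simple dot-product scoring with the query `user_vec + product_vec`, which is a simplification chosen for brevity and is not the scoring function of the paper; all names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden, user_vec, product_vec):
    """Pool hidden vectors using attention weights conditioned on user and product.

    hidden: list of equal-length vectors (one per word or sentence).
    Returns (weights, pooled_vector).
    """
    # Assumption: the attention query is the sum of user and product vectors.
    query = [u + p for u, p in zip(user_vec, product_vec)]
    scores = [sum(h_k * q_k for h_k, q_k in zip(h, query)) for h in hidden]
    weights = softmax(scores)
    dim = len(hidden[0])
    pooled = [sum(w * h[k] for w, h in zip(weights, hidden)) for k in range(dim)]
    return weights, pooled


# Two toy "word" vectors; the user and product both emphasize the first dimension.
weights, pooled = attend([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0], [1.0, 0.0])
```

In a hierarchical model the same pooling would be applied twice: over words to build each sentence vector, then over sentences to build the document vector.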

4.
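A comparative study of supervised-learning techniques for Chinese sentiment classification   Cited by: 6 (self-citations: 0, citations by others: 6)
Sentiment classification is a classification technique of considerable practical value: it can, to some extent, bring order to cluttered online reviews and help users locate the information they need. Research on Chinese sentiment classification is still relatively scarce; in particular, the effectiveness of the various supervised learning methods, and the influence of text feature representation and feature selection on classification performance, remain to be studied. This paper uses n-grams, nouns, verbs, adjectives and adverbs as text features; mutual information, information gain, the CHI statistic and document frequency as feature selection methods; and the centroid method, KNN, Winnow, Naive Bayes and SVM as classifiers. Chinese sentiment classification experiments were run with different numbers of features and different training-set sizes, and the results were compared. The comparison shows that bigram features, information gain feature selection and an SVM classifier achieve good results given a sufficiently large training set and an appropriate number of features.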
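The best-performing combination reported above pairs bigram features with information gain selection. A minimal sketch of those two steps follows, assuming each document is already tokenized and represented as a set of features; the helper names and the toy data are hypothetical.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return the list of adjacent token pairs."""
    return [tuple(tokens[i:i + 2]) for i in range(len(tokens) - 1)]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(docs, labels, feature):
    """Entropy reduction from splitting documents on presence of `feature`.

    docs: list of feature sets, labels: list of class labels (same length).
    """
    present = [y for d, y in zip(docs, labels) if feature in d]
    absent = [y for d, y in zip(docs, labels) if feature not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


# Toy corpus: two positive and two negative documents.
docs = [set(bigrams(["very", "good", "film"])),
        set(bigrams(["very", "good", "plot"])),
        set(bigrams(["very", "bad", "film"])),
        set(bigrams(["very", "bad", "plot"]))]
labels = [1, 1, 0, 0]
ranked = sorted({f for d in docs for f in d},
                key=lambda f: information_gain(docs, labels, f), reverse=True)
```

Keeping the top-k features from `ranked` and feeding them to a classifier such as an SVM would complete the pipeline; the choice of k corresponds to the "appropriate number of features" in the abstract.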

5.
Sentiment analysis and opinion mining are valuable for extracting useful subjective information from text documents. These tasks have become of great importance, especially for business and marketing professionals, since product and service reviews posted online affect markets and consumer shifts. This work is motivated by the fact that automating the retrieval and detection of sentiments expressed about products and services involves complex processes and poses research challenges, due to textual phenomena and language-specific variations in expression. This paper proposes a fast, flexible, generic methodology for sentiment detection in textual snippets that express people's opinions in different languages. The proposed methodology adopts a machine learning approach in which textual documents are represented by vectors and used to train a polarity classification model. Several document vector representation approaches have been studied, including lexicon-based, word embedding-based and hybrid vectorizations. The suitability of these feature representations for the sentiment classification task is assessed through experiments on four datasets containing online user reviews in both Greek and English, in order to represent high- and weak-inflection language groups. The proposed methodology requires minimal computational resources, so it may have impact in real-world scenarios where resources are limited.

6.
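Sentiment classification is a classification technique of considerable practical value. It labels the sentiment orientation of the vast and varied information online and gives users a concise summary, which in turn supports decision-making. However, little work has so far addressed sentiment classification for Chinese. This paper proposes a sentiment classification method based on SVM machine learning and introduces a sentence subjectivity analysis method based on a 2-POS model; an SVM is trained to classify the sentiment of Chinese reviews. Experiments show that the method effectively determines the sentiment orientation of reviews.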

7.
Recommender systems suggest items that users might like according to their explicit and implicit feedback information, such as ratings, reviews, and clicks. However, most recommender systems focus mainly on the relationships between items and the user's final purchasing behavior while ignoring the user's emotional changes, which play an essential role in consumption activity. To address the challenge of improving the quality of recommender services, this paper proposes an emotion-aware recommender system based on hybrid information fusion in which three representative types of information are fused to comprehensively analyze the user's features: user rating data as explicit information, user social network data as implicit information and sentiment from user reviews as emotional information. The experimental results verify that the proposed approach provides a higher prediction rating and significantly increases the recommendation accuracy.

8.
9.
Nowadays, Big Data, a large volume of both structured and unstructured data, is generated from social media. Social media are powerful marketing tools, and social big data can offer business insights. The major challenge facing social big data is finding efficient techniques to collect a large volume of social data and extract insights from it. Sentiment analysis of social big data can provide business insights by extracting public opinion. Traditional analytic platforms need to be scaled up to analyze a large volume of social big data. Social data are by nature short and generally not written with proper grammar, which makes it difficult to achieve highly reliable results in sentiment analysis. Acquiring effective training data is a challenge, although learning-based approaches are good for sentiment classification, and manual labeling of training data is time- and labor-consuming. In this paper, a sentiment analysis system on a Big Data analytics platform is proposed to provide valuable information by analyzing large-scale social data in an efficient and timely manner, since it has been implemented using the MapReduce framework and Hadoop distributed storage (HDFS). The proposed system consists of four modules: data collection, data cleaning and preprocessing, class labeling, and sentiment classification. The system enables high-performance sentiment classification while combining the effortless setup of a lexicon-based classifier with a learning-based classifier. Twitter stream data is used for system evaluation, as Twitter is a widespread social medium and a good source of information in the sense of snapshots of moods and feelings as well as up-to-date events. The evaluation results show that the system achieves a promising accuracy of 84.2%. Moreover, the system is able to scale up to analyze large-scale data, with processing time decreasing as more nodes are added to the cluster.

10.
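A resource analysis model based on a sentiment ontology   Cited by: 1 (self-citations: 0, citations by others: 1)
Methods for resource analysis are studied, and an analysis method based on a sentiment ontology is proposed. A sentiment ontology is first built on HowNet ("知网"); feature words for resource analysis are then extracted with the ontology and their sentiment orientation is determined; finally, the sentiment orientation of the whole text is analyzed from the extracted feature words. Experiments show that, with manual annotation as the baseline, the method of extracting feature words with the sentiment ontology achieves a sentiment recognition accuracy of 78.87%.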

11.
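Sentiment classification determines the sentiment polarity of data and is widely applied to product reviews, microblog topics and similar data. Because labels are expensive to obtain, traditional sentiment classification methods struggle to classify data from different domains effectively, so cross-domain sentiment classification has attracted wide attention. Most existing cross-domain methods extract lexical and syntactic features on the basis of co-occurrence and ignore the semantic relations between words. To address this, a word2vec-based cross-domain sentiment classification method, WEEF (Cross-domain Classification based on Word Embedding Extension Feature), is proposed. High-quality features that co-occur across domains are selected as a bridge and used as seeds; using word-vector similarity, domain-specific features are added to these seeds to form feature clusters, which reduces the difference between domains. Experiments on the SRAA and Amazon product review datasets show that the method is effective, especially when the amount of data is large.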
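The seed expansion step can be sketched as follows, assuming word vectors are already available (for example from a trained word2vec model) as a plain dictionary. The similarity threshold, the nearest-seed assignment rule and the toy vectors are assumptions made for illustration and are not taken from the WEEF paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_seeds(seeds, specific, vectors, threshold=0.8):
    """Attach each domain-specific word to its most similar seed feature.

    seeds: features shared across domains, specific: domain-specific features,
    vectors: dict mapping word -> embedding.
    Words whose best similarity is below `threshold` stay unclustered.
    """
    clusters = {s: [s] for s in seeds}
    for word in specific:
        best = max(seeds, key=lambda s: cosine(vectors[word], vectors[s]))
        if cosine(vectors[word], vectors[best]) >= threshold:
            clusters[best].append(word)
    return clusters


# Hypothetical 2-d embeddings, invented for the example.
vectors = {"good": [1.0, 0.0], "bad": [0.0, 1.0],
           "durable": [0.9, 0.1], "laggy": [0.1, 0.9], "blue": [0.7, 0.7]}
clusters = expand_seeds(["good", "bad"], ["durable", "laggy", "blue"], vectors)
```

Each resulting cluster can then be treated as a single feature, so that "durable" in one domain and "good" in another map to the same dimension.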

12.
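Sentiment classification is a classification technique of practical value. It has been studied extensively for English and Chinese but little for Uyghur. Using n-gram models as text features; mutual information, information gain, the CHI statistic and document frequency as feature selection methods; varying numbers of features; and Naive Bayes, ME (maximum entropy) and SVM (support vector machine) as classifiers, Uyghur sentiment classification experiments were carried out and their results compared. The results show that with unigram features, 5,000 features and a suitable feature selection function, ME and SVM achieve good results on Uyghur sentiment classification.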

13.
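Sentiment classification is a classification technique of considerable practical value: it can, to some extent, bring order to cluttered online reviews and help users locate the information they need. Research on Chinese sentiment classification is still relatively scarce. This paper classifies online reviews by sentiment, deciding whether a review is positive or negative. Among the many machine learning methods for text classification, it uses a support vector machine. Its distinguishing feature is that words with a semantic orientation, combined with their part of speech, serve as feature terms, weighted by their TF-IDF values. Experiments show that the method classifies online reviews with high accuracy.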
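The feature weighting described above can be sketched with a plain TF-IDF computation over (word, part-of-speech) pairs. The abstract does not state which TF-IDF variant was used, so the formula below (relative term frequency times the natural log of inverse document frequency) is an assumption, and the tagged toy documents are invented.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for each document.

    docs: list of documents, each a list of hashable features,
          here (word, pos) tuples.
    Returns one {feature: weight} dict per document.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each feature once per document
    weighted = []
    for doc in docs:
        term_freq = Counter(doc)
        weighted.append({
            feat: (count / len(doc)) * math.log(n_docs / doc_freq[feat])
            for feat, count in term_freq.items()
        })
    return weighted


# Toy reviews with hypothetical part-of-speech tags.
docs = [[("good", "ADJ"), ("good", "ADJ"), ("film", "NOUN")],
        [("bad", "ADJ"), ("film", "NOUN")]]
weights = tf_idf(docs)
```

A feature that appears in every document, such as `("film", "NOUN")` here, receives weight 0 and so contributes nothing to the classifier input, which is the intended effect of the IDF factor.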

14.
Journal of Intelligent Information Systems - Sentiment analysis for user reviews has received substantial attention in recent years. There are many deep learning models for natural language processing...

15.
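高华玲, 张晶. 《软件》, 2021, (1): 45-47, 66
To study the strengths and weaknesses of high-end hotel service and to analyze public opinion in hotel user reviews, this article performs sentiment analysis and visualization on reviews of high-end hotels and proposes hotel strengths and improvement strategies. A domain sentiment lexicon is built by combining the general-purpose sentiment lexicon HowNet with a domain-specific lexicon for hotel reviews. The domain lexicon is then combined with other special lexicons, such as phrase, negation-word and adverb lexicons, for sentiment classification. The sentiment words of the three resulting polarities are counted by frequency and drawn as word clouds, and suggestions for improving the business strategy of high-end hotels are given on the basis of the word clouds.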
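A lexicon-based classifier of the kind described above can be sketched in a few lines. The English toy lexicons below stand in for the HowNet-based domain lexicon, negation lexicon and adverb lexicon; the entries, the weights and the rule that modifiers apply to the next sentiment word are all assumptions made for the example.

```python
# Hypothetical toy lexicons; a real system would load HowNet and domain entries.
SENTIMENT = {"clean": 1.0, "friendly": 1.0, "dirty": -1.0, "noisy": -1.0}
NEGATIONS = {"not", "never"}
DEGREE = {"very": 2.0, "slightly": 0.5}


def score_tokens(tokens):
    """Sum sentiment scores, applying pending negations and degree adverbs."""
    score = 0.0
    negate = False
    weight = 1.0
    for tok in tokens:
        if tok in NEGATIONS:
            negate = not negate          # double negation cancels out
        elif tok in DEGREE:
            weight *= DEGREE[tok]
        elif tok in SENTIMENT:
            value = SENTIMENT[tok] * weight
            score += -value if negate else value
            negate, weight = False, 1.0  # modifiers apply to one sentiment word
    return score


def polarity(tokens):
    """Map a score to one of three polarities."""
    s = score_tokens(tokens)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


label = polarity(["room", "not", "very", "clean"])
```

Counting the sentiment words that fall into each polarity with `collections.Counter` would then give the frequencies needed to draw the word clouds.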

16.
Pattern Recognition, 2014, 47(2): 758-768
Sentiment analysis, which detects the subjectivity or polarity of documents, is one of the fundamental tasks in text data analytics. Recently, the number of documents available online and offline has been increasing dramatically, and preprocessed text data have more features. This growth makes effective analysis more difficult. This paper proposes a novel semi-supervised Laplacian eigenmap (SS-LE). SS-LE removes redundant features effectively by decreasing detection errors of sentiments. Moreover, it enables visualization of documents in a perceptible low-dimensional embedded space, providing a useful tool for text analytics. The proposed method is evaluated on a multi-domain review data set, in sentiment visualization and classification, against other dimensionality reduction methods. SS-LE provides a better similarity measure in the visualization result by separating positive and negative documents properly. Sentiment classification models trained on data reduced by SS-LE show higher accuracy. Overall, the experimental results suggest that SS-LE has the potential to be used to visualize documents for ease of analysis and to train a predictive model in sentiment analysis. SS-LE can also be applied to any other partially annotated text data sets.

17.
Sentiment analysis is the natural language processing task dealing with sentiment detection and classification from texts. In recent years, due to the growth in the quantity and fast spreading of user-generated content online and the impact such information has on events, people and companies worldwide, this task has been approached in an important body of research in the field. Although different methods have been proposed for distinct types of text, the research community has concentrated less on developing methods for languages other than English. In this context, the present work studies the possibility of employing machine translation systems and supervised methods to build models able to detect and classify sentiment in languages for which fewer or no resources are available for this task when compared to English, stressing the impact of translation quality on sentiment classification performance. Our extensive evaluation scenarios show that machine translation systems are approaching a good level of maturity and that they can, in combination with appropriate machine learning algorithms and carefully chosen features, be used to build sentiment analysis systems that obtain performance comparable to that obtained for English.

18.
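Building on the theory of consumer behavior analysis and discrete choice, this work performs feature-level sentiment analysis on user-generated content, extracts model features from both objective product data and subjective user-generated content, and trains an MNL model with supervised learning to predict each product's consumer surplus as the basis for search ranking. A product search system covering mobile phones, laptops and digital cameras was implemented. A double-blind experiment shows that the proposed product search model performs significantly better than the baseline algorithm.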
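For readers unfamiliar with the MNL (multinomial logit) model, the sketch below shows its core computation: a linear utility per product and softmax choice probabilities. The coefficients, the feature layout and the use of utility as the ranking key are assumptions for illustration; the abstract does not specify how consumer surplus is derived from the fitted model.

```python
import math


def utility(features, coefficients):
    """Deterministic utility: a linear combination of product features."""
    return sum(x * b for x, b in zip(features, coefficients))


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities (softmax over utilities)."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def rank_products(products, coefficients):
    """Return product names sorted by utility, highest first."""
    scored = [(utility(feats, coefficients), name) for name, feats in products]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical features: [battery sentiment score, screen sentiment score, price].
products = [("phone_a", [0.8, 0.2, 3.0]),
            ("phone_b", [0.1, 0.9, 2.0]),
            ("phone_c", [0.5, 0.5, 4.0])]
coefficients = [1.0, 1.0, -0.5]  # assumed: a higher price lowers utility
ranking = rank_products(products, coefficients)
probs = mnl_probabilities([utility(f, coefficients) for _, f in products])
```

In a trained system the coefficients would be estimated from observed choices rather than set by hand, and sentiment scores extracted from reviews would sit alongside objective attributes such as price.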

19.
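Existing joint sentiment-topic (JST) models identify topics and sentiment in text simultaneously, but they mainly model text content and do not consider user characteristics, which introduces user demographic bias and behavioral event bias into the sentiment analysis results. A joint sentiment-topic model that accounts for user characteristics (JUST) is proposed. Its main improvement is to add user features to the model, using a linear function of the features of the user associated with a document as the prior of the document-sentiment distribution, which yields the sentiment tendencies of user groups with different characteristics. The effectiveness of JUST was tested on a dataset of 13,252 car reviews from the 汽车之家 website (www.autohome.com.cn). The results show that JUST, with user features added, classifies sentiment better than the JST and TSMMF models. The differences in topic-level sentiment between users with different characteristics on the site are also compared.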

20.
Sentiment analysis is one of the fastest growing research areas in computer science, making it challenging to keep track of all the activities in the area. We present a computer-assisted literature review, where we utilize both text mining and qualitative coding, and analyze 6996 papers from Scopus. We find that the roots of sentiment analysis are in the studies on public opinion analysis at the beginning of the 20th century and in the text subjectivity analysis performed by the computational linguistics community in the 1990s. However, the outbreak of computer-based sentiment analysis only occurred with the availability of subjective texts on the Web. Consequently, 99% of the papers have been published after 2004. Sentiment analysis papers are scattered across multiple publication venues, and the combined number of papers in the top-15 venues represents only ca. 30% of the papers in total. We present the top-20 cited papers from Google Scholar and Scopus and a taxonomy of research topics. In recent years, sentiment analysis has shifted from analyzing online product reviews to social media texts from Twitter and Facebook. Many topics beyond product reviews, such as stock markets, elections, disasters, medicine, software engineering and cyberbullying, extend the utilization of sentiment analysis.
