
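Construction of a Chinese Sentiment Lexicon Using Bilingual Information and the Label Propagation Algorithm
Cite this article: LI Shoushan, LEE Sophia Yat Mei, HUANG Chu-Ren, SU Yan. Construction of a Chinese Sentiment Lexicon Using Bilingual Information and the Label Propagation Algorithm [J]. Journal of Chinese Information Processing (中文信息学报), 2013, 27(6): 75-82.
Author names: LI Shoushan, LEE Sophia Yat Mei, HUANG Chu-Ren, SU Yan
Author affiliations: 1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006;
2. Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong
Funding: Hong Kong GRF project (543810); National Natural Science Foundation of China (61003155, 61273320)
Abstract: Text sentiment analysis is currently an active research topic in natural language processing, with broad practical value and theoretical significance. Sentiment lexicon construction is a fundamental task in text sentiment analysis: it classifies words as positive, neutral, or negative according to their sentiment orientation. However, Chinese sentiment lexicon construction faces two main problems: 1) many sentiment words are polysemous or ambiguous, that is, the semantic orientation of a word varies across contexts, which makes it difficult to compute the word's sentiment; 2) as the state of related research in China and abroad shows, relatively few resources are available for building Chinese sentiment lexicons. Given that English sentiment analysis research has a large number of corpora and lexicons, this paper uses a machine translation system, combines it with constraint information from bilingual resources, and applies the label propagation (LP) algorithm to compute the sentiment of words. Experimental results in four domains show that our method yields a Chinese sentiment lexicon with high classification accuracy that covers domain-specific contexts.

Keywords: sentiment analysis; bilingual information; sentiment lexicon; label propagation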

Construction of Chinese Sentiment Lexicon using Bilingual Information and Label Propagation Algorithm
LI Shoushan, LEE Sophia Yat Mei, HUANG Chu-Ren, SU Yan. Construction of Chinese Sentiment Lexicon using Bilingual Information and Label Propagation Algorithm [J]. Journal of Chinese Information Processing, 2013, 27(6): 75-82.
Authors: LI Shoushan, LEE Sophia Yat Mei, HUANG Chu-Ren, SU Yan
Affiliation: 1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China; 2. Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong, China
Abstract: Sentiment analysis has become an active research topic in natural language processing (NLP) because of its practical value and theoretical significance. A basic task in sentiment analysis, sentiment lexicon construction, aims to classify each word as positive, neutral, or negative according to its sentiment orientation. However, there are two major challenges: 1) many Chinese sentiment words are ambiguous, which makes it hard to compute the sentiment orientation of a word; 2) the resources available for constructing Chinese sentiment lexicons remain scarce. Because many corpora and lexicons exist for English sentiment analysis, in this study we use a machine translation system together with bilingual (English and Chinese) resources, and then obtain the sentiment orientation of Chinese words with the label propagation algorithm. Experimental results across four domains demonstrate that the lexicon generated with our approach achieves high precision and covers domain information effectively.
Keywords: sentiment analysis; bilingual; sentiment lexicon; label propagation algorithm
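
The abstract names label propagation over bilingual resources but does not describe the graph the authors build. As a rough illustration only, the following sketch runs a generic clamped label propagation on a toy graph. The graph, the edge weights, the seed words, and the function name `label_propagation` are all assumptions made for this example and are not taken from the paper. English seed words stand in for an existing English lexicon, and edges stand in for translation or similarity links. Only two labels are used to keep the example small.

```python
def label_propagation(edges, seeds, n_iter=100):
    """Propagate seed labels over an undirected weighted graph.

    edges: dict mapping (node_a, node_b) -> positive weight (hypothetical links)
    seeds: dict mapping node -> known label; seed nodes stay clamped
    Returns a dict node -> {label: score}, with scores summing to 1 per node.
    """
    labels = sorted(set(seeds.values()))

    # Build a symmetric adjacency map; make sure isolated seeds are present too.
    neighbors = {}
    for (a, b), w in edges.items():
        neighbors.setdefault(a, {})[b] = w
        neighbors.setdefault(b, {})[a] = w
    for node in seeds:
        neighbors.setdefault(node, {})

    def one_hot(label):
        return {l: 1.0 if l == label else 0.0 for l in labels}

    # Seeds start with their known label; all other nodes start uniform.
    uniform = {l: 1.0 / len(labels) for l in labels}
    scores = {
        n: one_hot(seeds[n]) if n in seeds else dict(uniform)
        for n in neighbors
    }

    for _ in range(n_iter):
        updated = {}
        for node, nbrs in neighbors.items():
            total = sum(nbrs.values())
            if node in seeds or total == 0:
                # Clamp seeds; leave nodes without neighbors unchanged.
                updated[node] = scores[node]
                continue
            # Each unlabeled node takes the weighted average of its neighbors.
            updated[node] = {
                l: sum(w * scores[m][l] for m, w in nbrs.items()) / total
                for l in labels
            }
        scores = updated
    return scores


# Toy bilingual graph (illustrative data, not from the paper).
edges = {
    ("good", "好"): 1.0,
    ("good", "不错"): 1.0,
    ("好", "不错"): 1.0,
    ("bad", "差"): 1.0,
    ("bad", "糟糕"): 1.0,
    ("差", "糟糕"): 1.0,
}
seeds = {"good": "positive", "bad": "negative"}

scores = label_propagation(edges, seeds)
for word in ["好", "不错", "差", "糟糕"]:
    best = max(scores[word], key=scores[word].get)
    print(word, best)
```

In this toy graph the Chinese words connected to the seed "good" drift toward the positive label and those connected to "bad" drift toward the negative label. A realistic setup would need a neutral class, many more seeds, and edge weights derived from actual translation and corpus evidence.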