
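A Study on Identifying Default Items of Evaluation Objects/Attributes in Opinion Sentences

Cite this article:	LIU Huihui, WANG Suge, ZHAO Celi. A Study on Identifying Default Items of Evaluation Objects/Attributes in Opinion Sentences[J]. Journal of Chinese Information Processing, 2014, 28(6): 175-182

Authors:	LIU Huihui, WANG Suge, ZHAO Celi

Affiliation:	1. School of Computer & Information Technology, Shanxi University, Taiyuan, Shanxi 030006; 2. MOE Key Laboratory of Computational Intelligence and Chinese Information Processing, Shanxi University, Taiyuan, Shanxi 030006; 3. School of Mathematics Science, Shanxi University, Taiyuan, Shanxi 030006

Funding:	National Natural Science Foundation of China (61175067, 61272095); Shanxi Province Science and Technology Key Project (20110321027-02); Shanxi Province Research Project for Returned Overseas Scholars (2013-014)

Abstract:	In review texts that cover multiple objects and multiple attributes, identifying omitted (default) evaluation objects and evaluation attributes plays an important role in opinion mining. To address the omission of evaluation objects and attributes in sentiment-bearing opinion sentences, this paper proposes an effective method for identifying default items. First, a set of default item identification rules is constructed and used to obtain the candidate set of default items to be identified. Default item identification is then treated as a binary classification problem: lexical and dependency syntactic features are selected, a classifier is trained with the C4.5 decision tree algorithm, and the candidate default items are judged on the test set. Experimental results show that the F-value obtained with the dependency syntactic feature set is about 2% higher than that obtained with the lexical feature set. Compared with either single feature type, combining the lexical and dependency syntactic features raises classification precision and F-value by about 10% and 5%, respectively, which indicates that fusing the two feature types benefits default item identification.
Keywords:	default item; identification rules; lexical features; dependency syntax; C4.5 algorithm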
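The abstract names C4.5 as the classifier but does not describe its internals. As a minimal sketch of the split criterion C4.5 is known for (gain ratio, that is, information gain divided by split information), the example below scores two categorical features on a toy candidate set. The feature names `first_pos` and `has_subject_dep` and all data rows are hypothetical illustrations, not the paper's features or data, and a full C4.5 implementation also handles continuous attributes and pruning, which are omitted here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, labels, feature):
    """Gain ratio of one categorical feature: information gain / split information."""
    total = len(rows)
    # Group the class labels by the value each row takes for this feature.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted entropy that remains after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes features that split into many small groups.
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical candidates: one lexical feature and one dependency feature.
rows = [
    {"first_pos": "adj", "has_subject_dep": False},
    {"first_pos": "adj", "has_subject_dep": False},
    {"first_pos": "noun", "has_subject_dep": True},
    {"first_pos": "adv", "has_subject_dep": True},
    {"first_pos": "adv", "has_subject_dep": False},
    {"first_pos": "noun", "has_subject_dep": True},
]
# True means the candidate really is a default (omitted) item; made-up labels.
labels = [True, True, False, False, True, False]

# The feature with the highest gain ratio would be chosen for the root split.
best_feature = max(["first_pos", "has_subject_dep"],
                   key=lambda f: gain_ratio(rows, labels, f))
```

On this toy data the dependency-style feature separates the two classes perfectly, so it would be selected first; real feature sets would of course be far larger and noisier.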
Identify the Default for Comment Object/Attribute for Opinion Sentence

LIU Huihui, WANG Suge, ZHAO Celi. Identify the Default for Comment Object/Attribute for Opinion Sentence[J]. Journal of Chinese Information Processing, 2014, 28(6): 175-182

Authors:	LIU Huihui, WANG Suge, ZHAO Celi

Affiliation:	1. School of Computer & Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China; 2. MOE Key Laboratory of Computational Intelligence and Chinese Information Processing, Shanxi University, Taiyuan, Shanxi 030006, China; 3. School of Mathematics Science, Shanxi University, Taiyuan, Shanxi 030006, China

Abstract:	Identifying default comment objects and comment attributes is important for opinion mining on review texts that involve multiple objects and multiple attributes. This paper proposes a new method to address this problem. First, a rule set for default item identification is constructed to obtain the candidate set of default items to be identified. We then treat default item identification as a binary classification problem, select lexical and dependency parsing features, and use the C4.5 decision tree algorithm to train a classification model, which is used to judge the candidate default items on the test data. Experimental results show that the F-value obtained with the dependency syntactic feature set is about 2% higher than that obtained with the lexical feature set. Compared with a single feature set, combining the lexical and dependency parsing feature sets increases precision and F-value by about 10% and 5%, respectively.

Keywords:	default item; identification rule; lexical feature; dependency syntactic; C4.5 algorithm
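The results above are reported in terms of F-value, which the abstract does not define. The sketch below assumes the common balanced definition (the harmonic mean of precision and recall) and computes it for a binary "is this candidate a default item" decision. The gold and predicted labels are made-up toy values, not the paper's data.

```python
def precision_recall_f(gold, predicted):
    """Precision, recall, and balanced F-value for boolean labels (True = default item)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # correctly flagged default items
    fp = sum(1 for g, p in pairs if not g and p)    # flagged, but not default items
    fn = sum(1 for g, p in pairs if g and not p)    # default items that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical gold annotations and classifier decisions for six candidates.
gold = [True, True, True, False, False, True]
predicted = [True, False, True, True, False, True]

precision, recall, f_value = precision_recall_f(gold, predicted)
```

Here three candidates are correctly flagged, one is flagged wrongly, and one is missed, so precision, recall, and F-value all come out to 0.75.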
This article is indexed in CNKI and other databases.