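Text sentiment classification algorithm based on feature selection and deep belief network
Cite this article:XIANG Jinyong, YANG Wenzhong, SILAMU Wushouer. Text sentiment classification algorithm based on feature selection and deep belief network[J]. Journal of Computer Applications, 2019, 39(7): 1942-1947.
Authors:XIANG Jinyong  YANG Wenzhong  SILAMU Wushouer
Affiliations:School of Information Science and Engineering, Xinjiang University, Urumqi 830046; Xinjiang Key Laboratory of Multilingual Information Technology (Xinjiang University), Urumqi 830046; School of Information Science and Engineering, Xinjiang University, Urumqi 830046
Funding:National Natural Science Foundation of China (U1603115, XJEDU2017T002, U1435215).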
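Abstract:Because human language is complex, most text sentiment classification algorithms suffer from an overly large vocabulary caused by redundancy. A deep belief network (DBN) addresses this problem by learning the useful information in the input corpus through its several hidden layers. For large applications, however, DBN is time-consuming and computationally expensive. To address this, a semi-supervised sentiment classification algorithm is proposed: the text sentiment classification algorithm based on feature selection and deep belief network (FSDBN). First, feature selection methods (document frequency (DF), information gain (IG), chi-square statistics (CHI) and mutual information (MI)) filter out irrelevant features, which reduces the complexity of the vocabulary. The selected features are then fed into the DBN, which makes its learning phase more efficient. The proposed algorithm was applied to Chinese and Uygur. Experimental results on a hotel review dataset show that FSDBN improves accuracy by 1.6% over DBN and cuts training time to half that of DBN.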

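Keywords:deep belief network  deep learning  feature selection  semi-supervised sentiment classification algorithm  restricted Boltzmann machine  text sentiment classification
Received:2018-11-28
Revised:2018-12-28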

Text sentiment classification algorithm based on feature selection and deep belief network
XIANG Jinyong, YANG Wenzhong, SILAMU Wushouer. Text sentiment classification algorithm based on feature selection and deep belief network[J]. Journal of Computer Applications, 2019, 39(7): 1942-1947.
Authors:XIANG Jinyong  YANG Wenzhong  SILAMU Wushouer
Affiliation:1. School of Information Science and Engineering, Xinjiang University, Urumqi Xinjiang 830046, China;
2. Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University, Urumqi Xinjiang 830046, China
Abstract:Owing to the complexity of human language, most text sentiment classification algorithms suffer from an excessively large vocabulary caused by redundancy. A Deep Belief Network (DBN) can address this problem by learning the useful information in the input corpus through its several hidden layers. However, DBN is time-consuming and computationally expensive for large applications. To address this, a semi-supervised sentiment classification algorithm, the text sentiment classification algorithm based on Feature Selection and Deep Belief Network (FSDBN), was proposed. First, feature selection methods, namely Document Frequency (DF), Information Gain (IG), chi-square statistics (CHI) and Mutual Information (MI), were used to filter out irrelevant features and reduce the complexity of the vocabulary. Then, the selected features were fed into the DBN to make its learning phase more efficient. The proposed algorithm was applied to Chinese and Uygur. Experimental results on a hotel review dataset show that the accuracy of FSDBN is 1.6% higher than that of DBN, and that its training time is half that of DBN.
Keywords:Deep Belief Network (DBN)  Deep Learning (DL)  Feature Selection (FS)  semi-supervised sentiment classification algorithm  Restricted Boltzmann Machine (RBM)  text sentiment classification  
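
The feature selection stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only two of the four criteria (DF and CHI), uses the standard 2x2 contingency-table form of the chi-square statistic, and stops at producing the reduced binary vectors that a DBN would take as input. The toy corpus, the function names (`document_frequency`, `chi_square`, `select_features`, `vectorize`) and the parameters `k` and `min_df` are all hypothetical and chosen for illustration.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, how many documents contain it (DF)."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return df


def chi_square(docs, labels, term, cls):
    """CHI statistic of `term` with respect to class `cls`.

    a: docs in cls containing term      b: docs outside cls containing term
    c: docs in cls without term         d: docs outside cls without term
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_cls = label == cls
        if has_term and in_cls:
            a += 1
        elif has_term:
            b += 1
        elif in_cls:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom


def select_features(docs, labels, k, min_df=1):
    """Drop terms with DF below min_df, then keep the k terms with the highest CHI."""
    df = document_frequency(docs)
    classes = sorted(set(labels))
    scores = {
        term: max(chi_square(docs, labels, term, cls) for cls in classes)
        for term, count in df.items()
        if count >= min_df
    }
    # Highest score first; ties broken alphabetically so the result is deterministic.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


def vectorize(docs, vocabulary):
    """Binary bag-of-words vectors restricted to the selected vocabulary."""
    return [[1 if term in tokens else 0 for term in vocabulary] for tokens in docs]


# Hypothetical toy corpus of tokenized hotel reviews (not the paper's dataset).
docs = [
    ["room", "clean", "great"],
    ["staff", "great", "friendly"],
    ["room", "dirty", "bad"],
    ["staff", "rude", "bad"],
]
labels = ["pos", "pos", "neg", "neg"]

vocabulary = select_features(docs, labels, k=2)
vectors = vectorize(docs, vocabulary)  # reduced input that a DBN would be trained on
```

On this toy corpus, terms such as "room" and "staff" appear equally often in both classes and score zero, so they are filtered out before any network training. That is the sense in which feature selection shrinks the DBN's input layer.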