
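Comparative Text Classification Method Based on Topic and Keyword Feature
Cite this article: DING Yong, CHENG Jiaqiao, JIANG Cuiqing, WANG Zhao. Comparative Text Classification Method Based on Topic and Keyword Feature[J]. Computer Engineering and Applications, 2021, 57(17): 196-202.
Authors: DING Yong  CHENG Jiaqiao  JIANG Cuiqing  WANG Zhao
Affiliation: 1. School of Management, Hefei University of Technology, Hefei 230009, China 2. Key Laboratory of Process Optimization and Intelligent Decision-making of Ministry of Education, Hefei 230009, China
Abstract: Comparative text is essential to the analysis of competing products, yet comparative text classification in the question-answering (Q&A) domain has received little study. Q&A texts carry rich comparative information and are concentrated in topic. Building on these two properties, this paper proposes a comparative text classification method that expands the feature set with topic features and keyword features. A pretrained topic model infers the topic probability distribution of each Q&A text, which serves as its topic feature. Because concatenating or summing keyword vectors loses keyword information, a GRU autoencoder is designed to extract features from the keyword vectors. Combining topic information with keyword semantics, classification features are built from six perspectives: language, product, sentiment, social, topic, and keyword. Several classifiers are then used to classify the Q&A texts. Experimental results show that the constructed features are effective and that comparative texts are classified well.

Keywords: topic model  autoencoder  feature expansion  comparative text classification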
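
The topic feature described in the abstract can be illustrated with a small sketch. The abstract does not say which topic model or inference procedure the authors use, so the code below swaps in a simple EM "folding-in" step against fixed topic-word probabilities. The function name `infer_topic_distribution`, the smoothing value `alpha`, and the toy `TOPIC_WORD` table are all hypothetical and are not taken from the paper.

```python
def infer_topic_distribution(tokens, topic_word, n_iter=50, alpha=0.1):
    """Estimate p(topic | document) while keeping the topic model fixed.

    topic_word[k][w] is p(w | topic k) from an already trained model
    (assumed format). Returns a list of K probabilities summing to 1.
    """
    k = len(topic_word)
    # Keep only words the pretrained model knows about.
    words = [w for w in tokens if any(w in t for t in topic_word)]
    theta = [1.0 / k] * k  # start from a uniform topic mixture
    if not words:
        return theta
    for _ in range(n_iter):
        counts = [alpha] * k  # smoothing keeps every topic above zero
        for w in words:
            # E-step: how responsible is each topic for this word?
            resp = [theta[j] * topic_word[j].get(w, 1e-12) for j in range(k)]
            z = sum(resp)
            for j in range(k):
                counts[j] += resp[j] / z
        # M-step: renormalise the expected counts into a distribution.
        total = sum(counts)
        theta = [c / total for c in counts]
    return theta


# Toy "pretrained" model with two topics (illustrative values only).
TOPIC_WORD = [
    {"battery": 0.5, "screen": 0.3, "camera": 0.2},  # topic 0: hardware
    {"price": 0.6, "cheaper": 0.3, "camera": 0.1},   # topic 1: cost
]

tokens = "is this phone cheaper than that one for the price".split()
theta = infer_topic_distribution(tokens, TOPIC_WORD)
```

The resulting `theta` is a fixed-length vector, so it can be appended directly to the other feature groups before classification.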

Comparative Text Classification Method Based on Topic and Keyword Feature
DING Yong,CHENG Jiaqiao,JIANG Cuiqing,WANG Zhao.Comparative Text Classification Method Based on Topic and Keyword Feature[J].Computer Engineering and Applications,2021,57(17):196-202.
Authors:DING Yong  CHENG Jiaqiao  JIANG Cuiqing  WANG Zhao
Affiliation:1.School of Management, Hefei University of Technology, Hefei 230009, China 2.Key Laboratory of Process Optimization and Intelligent Decision-making of Ministry of Education, Hefei 230009, China
Abstract:Comparative text is important for competitive product analysis, but little research has addressed comparative text classification in the Q&A domain. Exploiting the rich comparative information and concentrated topics of Q&A texts, this paper proposes a comparative text classification method based on topic feature and keyword feature expansion. A pretrained topic model is used to infer the topic probability distribution of each Q&A text, which serves as its topic feature. Because vector concatenation and summation lose keyword information, a GRU autoencoder is designed to extract keyword vector features, and the encoder output is used as the keyword feature of the Q&A text. Integrating topic information and keyword semantics, comparative text classification features are constructed from six perspectives: linguistic, product, sentiment, social, topic, and keyword. The Q&A texts are then classified with several classifiers. Experimental results show that the constructed features are effective and that comparative text classification performs well.
Keywords:topic model  autoencoder  feature expansion  comparative text classification  
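
To make the keyword-feature idea concrete, the following is a minimal sketch of a GRU encoder forward pass in plain Python. It is not the authors' implementation: the class name `GRUEncoder`, the dimensions, and the random untrained weights are assumptions, biases are omitted, and the decoder and reconstruction training that a full autoencoder needs are left out. It only shows why an encoder's final hidden state can preserve information that summing keyword vectors discards, here the order of the keywords.

```python
import math
import random


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def _matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def _rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUEncoder:
    """Encoder half of a GRU autoencoder (untrained, no bias terms)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One input matrix W and one recurrent matrix U for each of:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: _rand_matrix(hidden_dim, input_dim, rng) for g in "zrn"}
        self.U = {g: _rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrn"}

    def step(self, x, h):
        z = [_sigmoid(a + b) for a, b in
             zip(_matvec(self.W["z"], x), _matvec(self.U["z"], h))]
        r = [_sigmoid(a + b) for a, b in
             zip(_matvec(self.W["r"], x), _matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in
             zip(_matvec(self.W["n"], x), _matvec(self.U["n"], reset_h))]
        # Blend the previous state with the candidate state.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def encode(self, vectors):
        """Return the final hidden state as a fixed-length feature."""
        h = [0.0] * self.hidden_dim
        for x in vectors:
            h = self.step(x, h)
        return h


# Three toy keyword vectors (one-hot stand-ins for word embeddings).
keywords = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
encoder = GRUEncoder(input_dim=3, hidden_dim=4)

forward = encoder.encode(keywords)
backward = encoder.encode(keywords[::-1])

# Summation gives the same vector regardless of keyword order.
summed_forward = [sum(col) for col in zip(*keywords)]
summed_backward = [sum(col) for col in zip(*keywords[::-1])]
```

Here `summed_forward` and `summed_backward` are identical, while `forward` and `backward` differ. Unlike concatenation, the encoded feature also keeps a fixed length however many keywords a text contains.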
This article is indexed in Wanfang Data and other databases.