
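Topic-Feature Lattice Analysis: A Quality Evaluation Method for User-Generated Text
Cite this article: ZHONG Jiang, ZHANG Shu-fang, GUO Wei-li, LI Xue. Topic-Feature Lattice Analysis: A Quality Evaluation Method for User-Generated Text[J]. Acta Electronica Sinica, 2018, 46(9): 2201-2206.
Authors: ZHONG Jiang  ZHANG Shu-fang  GUO Wei-li  LI Xue
Affiliation: 1. Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education, Chongqing University, Chongqing 400030; 2. College of Computer Science, Chongqing University, Chongqing 400030; 3. Chongqing College of Electronic Engineering, Chongqing 401331; 4. School of Information Technology and Electrical Engineering, University of Queensland, Brisbane 4072, Australia
Abstract: This paper designs a quality analysis framework for user-generated text. First, a set of topic-features is built for each goods category through topic analysis. Second, exploiting the strong association between topic-features and goods categories, a formal context for formal concept analysis is constructed; the category-topic concept lattice is then reduced to generate a topic-feature lattice, from which five quality features are defined and a quality evaluation model is built. Finally, experiments on real review data show that the new method achieves higher prediction accuracy.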

Keywords: user review  quality evaluation  topic features  topic-feature lattice
Received: 2017-06-05

TFLA: A Quality Analysis Framework for User Generated Contents
ZHONG Jiang,ZHANG Shu-fang,GUO Wei-li,LI Xue.TFLA: A Quality Analysis Framework for User Generated Contents[J].Acta Electronica Sinica,2018,46(9):2201-2206.
Authors:ZHONG Jiang  ZHANG Shu-fang  GUO Wei-li  LI Xue
Affiliation: 1. Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education, Chongqing University, Chongqing 400030, China; 2. College of Computer Science, Chongqing University, Chongqing 400030, China; 3. Chongqing College of Electronic Engineering, Chongqing 401331, China; 4. School of Information Technology and Electrical Engineering, University of Queensland, Brisbane 4072, Australia
Abstract: This paper presents a topic-feature lattice analysis (TFLA) framework for assessing the quality of user-generated content along objective quality dimensions. First, we apply latent Dirichlet allocation (LDA) to extract latent topics, which serve as the topic-features of each goods category. Second, we construct a formal context from the strong association between goods categories and topic-features, and use formal concept analysis (FCA) to derive the generalization and instantiation relationships among the topic-features. Using domain knowledge together with these relationships, we define five objective quality features, and then train quality evaluation models on them with machine learning methods. Experiments on real review data sets show that the predictions of the new models agree with manually assigned quality labels in most cases. The best model achieves a mean absolute error (MAE) of 0.7 and an F-measure of 0.5, which is significantly better than a conventional quality prediction model based on support vector machine (SVM) classification.
Keywords:user comment  data quality  topic features  lattices of topic-features  
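
The formal concept analysis step named in the abstract can be illustrated with a minimal sketch. The toy context below (category names, feature names) is hypothetical and not taken from the paper, and the brute-force enumeration is only one simple way to list formal concepts. It does not reproduce the authors' lattice reduction or their five quality features.

```python
from itertools import combinations

# Hypothetical toy formal context: goods category -> topic-features it exhibits.
# In the paper the features would come from LDA topics; here they are made up.
CONTEXT = {
    "phone": {"battery", "screen", "price"},
    "laptop": {"battery", "screen", "keyboard", "price"},
    "book": {"price", "plot"},
}


def common_features(categories, context):
    """Features shared by every given category (all features if none given)."""
    shared = set().union(*context.values())
    for category in categories:
        shared &= context[category]
    return shared


def categories_with(features, context):
    """Categories that exhibit every feature in `features`."""
    return {c for c, fs in context.items() if features <= fs}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every category subset.

    Brute force: exponential in the number of categories, fine for a toy case.
    """
    concepts = set()
    categories = list(context)
    for size in range(len(categories) + 1):
        for subset in combinations(categories, size):
            intent = common_features(subset, context)
            extent = categories_with(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


def is_more_general(a, b):
    """Concept `a` generalizes `b` when a's extent strictly contains b's."""
    return b[0] < a[0]


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

In this toy context, the concept whose intent is only `price` covers all three categories and therefore generalizes the concept shared by `phone` and `laptop`; this is the kind of generalization and instantiation relationship among topic-features that the abstract refers to.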