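Microblog Hot Topic Evolution Based on an Improved On-line Biterm Topic Model
Cite this article: WU Di, ZHANG Mengtian, SHENG Long, HUANG Zhuyun, GU Mingxing. Microblog Hot Topic Evolution Based on Improved On-Line Biterm Topic Model[J]. Computer Engineering and Applications, 2021, 57(24): 179-184. DOI: 10.3778/j.issn.1002-8331.2007-0151
Authors: WU Di  ZHANG Mengtian  SHENG Long  HUANG Zhuyun  GU Mingxing
Affiliation: College of Information and Electrical Engineering, Hebei University of Engineering, Handan, Hebei 056038, China
Abstract: Topic evolution analysis is one of the research hotspots in public opinion monitoring, and evolution analysis of microblog hot topics has real practical significance for both network users and network regulators. The On-line Biterm Topic Model (OBTM) mixes old and new topics and gives redundant words relatively high probability. To address these problems, OBTM is improved and an OBTM based on topic labels and prior parameters (Topic Labels and Prior Parameters OBTM, LPOBTM) is proposed. According to the topic labels of microblog hot topics, the microblog text set is split into two data sets, one containing topic labels and one not, and different document-topic prior parameters are set for them. On the basis of the document-topic probability distribution of the previous time slice, all topics are ranked by intensity using the Sigmoid function, which optimizes the method for computing the prior parameters of the topic-word distribution on the current time slice. Experimental results show that LPOBTM describes the content evolution of topics more accurately and has lower model perplexity.

Keywords: topic label  prior parameter  topic intensity ranking  On-line Biterm Topic Model  microblog hot topic evolution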

Microblog Hot Topic Evolution Based on Improved On-Line Biterm Topic Model
WU Di,ZHANG Mengtian,SHENG Long,HUANG Zhuyun,GU Mingxing. Microblog Hot Topic Evolution Based on Improved On-Line Biterm Topic Model[J]. Computer Engineering and Applications, 2021, 57(24): 179-184. DOI: 10.3778/j.issn.1002-8331.2007-0151
Authors:WU Di  ZHANG Mengtian  SHENG Long  HUANG Zhuyun  GU Mingxing
Affiliation:College of Information and Electrical Engineering, Hebei University of Engineering, Handan, Hebei 056038, China
Abstract:Topic evolution analysis is a research hotspot in public opinion monitoring, and analyzing the evolution of microblog hot topics is of practical significance to both network users and network regulators. The On-line Biterm Topic Model (OBTM) mixes old and new topics and assigns relatively high probability to redundant words. To address these problems, this paper proposes an improved OBTM based on topic labels and prior parameters (LPOBTM). According to the topic labels of microblog hot topics, the microblog text set is divided into two data sets, one with and one without topic labels, and different document-topic prior parameters are set for each. Based on the document-topic probability distribution in the previous time slice, all topics are ranked by intensity with the help of the Sigmoid function, which optimizes how the prior parameters of the topic-word distribution are computed for the current time slice. The experimental results show that LPOBTM describes the content evolution of topics more accurately and achieves lower model perplexity.
Keywords:topic label  prior parameter  topic intensity ranking  On-line Biterm Topic Model(OBTM)  microblog hot topic evolution  
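
The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the placeholder prior values, the use of mean document-topic probability as topic intensity, the rank-to-weight mapping through the sigmoid, and the additive prior update are all assumptions made for illustration.

```python
import math
import re

# Weibo hashtags are conventionally written between paired hash marks: #topic#
HASHTAG = re.compile(r"#[^#]+#")


def split_by_hashtag(posts):
    """Partition posts into (with topic label, without topic label)."""
    tagged = [p for p in posts if HASHTAG.search(p)]
    untagged = [p for p in posts if not HASHTAG.search(p)]
    return tagged, untagged


def doc_topic_alpha(post, alpha_tagged=0.1, alpha_untagged=0.5):
    """Pick a document-topic prior per group (both values are placeholders)."""
    return alpha_tagged if HASHTAG.search(post) else alpha_untagged


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def topic_strength(doc_topic):
    """Assumed intensity: mean document-topic probability of each topic."""
    n_docs = len(doc_topic)
    n_topics = len(doc_topic[0])
    return [sum(row[k] for row in doc_topic) / n_docs for k in range(n_topics)]


def rank_weights(strength):
    """Map each topic's intensity rank to a weight in (0, 1) via the sigmoid.

    Ranks are centred on zero so the strongest topic gets the largest weight.
    This mapping is a hypothetical choice, not taken from the paper.
    """
    n = len(strength)
    order = sorted(range(n), key=lambda k: strength[k], reverse=True)
    weights = [0.0] * n
    for rank, k in enumerate(order):
        weights[k] = sigmoid((n - 1) / 2.0 - rank)
    return weights


def next_beta(prev_topic_word_counts, weights, base_beta=0.01):
    """Assumed update: beta[k][w] = base_beta + weight[k] * previous count."""
    return [
        [base_beta + w_k * count for count in row]
        for row, w_k in zip(prev_topic_word_counts, weights)
    ]


posts = ["#worldcup# final tonight", "just a normal post", "price is #1"]
tagged, untagged = split_by_hashtag(posts)

# Previous time slice: 2 documents over 3 topics, and topic-word counts.
prev_doc_topic = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
prev_counts = [[2, 0], [1, 1], [0, 4]]

weights = rank_weights(topic_strength(prev_doc_topic))
beta = next_beta(prev_counts, weights)
```

In this sketch a strongly ranked topic carries more of its previous word counts into the current time slice than a weakly ranked one, which is one plausible way to limit the mixing of old and new topics that the abstract attributes to OBTM.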
This article is indexed in Wanfang Data and other databases.