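Research on a Classification-Based Method for Discovering Hot Topics in Chinese Microblogs
Cite this article: ZHENG Fei, ZHANG Lei. Research on a Classification-Based Method for Discovering Hot Topics in Chinese Microblogs[J]. Netinfo Security (信息网络安全), 2014(9):127-131.
Authors: ZHENG Fei (郑飞)  ZHANG Lei (张蕾)
Affiliation: Shanghai Bureau of Public Security, Shanghai 200025
Abstract: Smartphones and microblog clients have strengthened the media character of microblogs, so discovering microblog topics in real time has practical significance. This paper proposes a keyword-classification-based method for discovering hot topics in Chinese microblogs. Microblog messages are filtered and categorized by keyword; a weighting function built from word frequency and growth rate within a time window extracts topic words; the conditional probability of two words appearing in the same text serves as the similarity criterion; and the topic words are clustered with an improved single-pass clustering algorithm. Analysis of the system's running results shows that the method can cluster and discover hot microblog topics effectively in real time.

Keywords: classification  microblog  topic discovery  clustering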

Classification-based Hot Topic Detection Approach on Chinese Micro-blog
ZHENG Fei,ZHANG Lei.Classification-based Hot Topic Detection Approach on Chinese Micro-blog[J].Netinfo Security,2014(9):127-131.
Authors:ZHENG Fei  ZHANG Lei
Affiliation:(Shanghai Bureau of Public Security, Shanghai 200025, China)
Abstract:Smartphones and micro-blog clients reinforce the media features of micro-blogs, so real-time detection of micro-blog hot topics has practical value. The paper introduces a real-time micro-blog hot topic detection method based on keyword classification. Micro-blog messages are filtered and classified according to keywords. A weighting function based on word frequency and growth rate within a time window is used to extract topic words. Same-text conditional probability between words serves as the similarity measure, and an improved single-pass clustering algorithm clusters the topic words to find hot topics. Analysis of the system's running results shows that the approach is effective at clustering and detecting micro-blog hot topics in real time.
Keywords:classification  micro-blog  topic detection  clustering
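
The abstract names three steps (time-window weighting, same-text conditional probability as similarity, single-pass clustering) without giving formulas, so the following Python sketch is only an illustration of how such a pipeline might fit together. Everything specific in it is an assumption and not taken from the paper: the function names, the linear mix `alpha` of frequency and growth, the use of the larger of the two conditional probabilities as a symmetric similarity, the average-link comparison, and the `threshold` value. The paper's "improved" single-pass variant is not described in the abstract and is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def word_weights(prev_docs, curr_docs, alpha=0.5):
    """Score each word in the current window (hypothetical weighting).

    Mixes document frequency in the current window with its growth
    relative to the previous window; the paper's real function is unknown.
    """
    prev = Counter(w for doc in prev_docs for w in set(doc))
    curr = Counter(w for doc in curr_docs for w in set(doc))
    return {
        w: alpha * f + (1 - alpha) * (f - prev[w]) / (prev[w] + 1)
        for w, f in curr.items()
    }


def cooccurrence_similarity(docs):
    """Return sim(a, b) built from same-text conditional probabilities."""
    df = Counter()       # number of texts containing each word
    pair_df = Counter()  # number of texts containing both words of a pair
    for doc in docs:
        words = sorted(set(doc))
        df.update(words)
        pair_df.update(combinations(words, 2))

    def sim(a, b):
        if a == b:
            return 1.0
        both = pair_df[tuple(sorted((a, b)))]
        if both == 0:
            return 0.0
        # Assumption: take the larger of P(b | a) and P(a | b).
        return max(both / df[a], both / df[b])

    return sim


def single_pass_cluster(words, sim, threshold=0.6):
    """Plain single-pass clustering: join the best cluster or open a new one."""
    clusters = []
    for w in words:
        best, best_score = None, 0.0
        for cluster in clusters:
            score = sum(sim(w, m) for m in cluster) / len(cluster)
            if score > best_score:
                best, best_score = cluster, score
        if best is not None and best_score >= threshold:
            best.append(w)
        else:
            clusters.append([w])
    return clusters


# Toy data: each text is already segmented into words.
prev_docs = [["weather", "rain"], ["lunch"]]
curr_docs = [
    ["earthquake", "sichuan", "rescue"],
    ["earthquake", "sichuan"],
    ["concert", "tickets"],
    ["concert", "tickets", "weather"],
]

weights = word_weights(prev_docs, curr_docs)
# Keep the four highest-weighted words; ties are broken alphabetically.
topic_words = sorted(weights, key=lambda w: (-weights[w], w))[:4]
clusters = single_pass_cluster(topic_words, cooccurrence_similarity(curr_docs))
print(clusters)
```

On this toy input the words split into one cluster about the earthquake and one about the concert. Real Chinese microblog text would first need word segmentation and the keyword-based filtering step, both of which are omitted here.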
This article is indexed in VIP (维普) and other databases.