Social Circle and Attention Based Information Popularity Prediction
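Cite this article: ZHENG Zuo-Wu, SHAO Si-Qi, GAO Xiao-Feng, CHEN Gui-Hai. Social Circle and Attention Based Information Popularity Prediction[J]. Chinese Journal of Computers, 2021, 44(5): 921-936.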
Authors: ZHENG Zuo-Wu  SHAO Si-Qi  GAO Xiao-Feng  CHEN Gui-Hai
Affiliation: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240
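Funding: Supported by the National Key R&D Program of China (2020YFB1707903); the National Natural Science Foundation of China (61872238, 61972254); the CCF-Tencent Research Fund (RAGR20200105); the Tencent Advertising Rhino-Bird Focused Research Program (FR202001); and the Huawei Cloud project (TC20201127009).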
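Abstract: Social networks have become the main medium for the dissemination and diffusion of information in the real world. Modeling and predicting hot information on them has broad application scenarios and commercial value, for example in information dissemination mining, advertising recommendation, and user behavior analysis. Existing work mainly models popularity using features and time series, but does not consider the role that users' social circles play in information dissemination. This paper proposes SCAP (Social Circle and Attention based Popularity Prediction), a popularity prediction model based on social circles and an attention mechanism. We first define social circles, extract features from users' historical text sequences with an autoencoder, and cluster users into social circles to obtain social circle features. Then, for a newly posted text message, we extract its text, user, and temporal features with a long short-term memory network and embedding layers and, based on an attention mechanism, capture the degree to which different social circles influence the message, yielding social circle attention features. Finally, the text, user, temporal, and social circle attention features are fused and passed through two fully connected layers to predict the popularity of the message. Experiments on four datasets from platforms including Twitter, Weibo, and Douban show that SCAP performs better overall than multiple baseline models: across the datasets, the mean squared error (MSE) decreases by 0.017, 0.022, 0.021, and 0.031, respectively, and the F1 score increases by 0.034, 0.021, 0.034, and 0.025, respectively, indicating that the model predicts the popularity of social information fairly accurately. We also examine how different experimental parameters affect the model, such as the number of users' historical text sequences, the number of social circles, and the length of the time series. Finally, we verify that each input feature and the attention mechanism contribute to the model's predictive performance: on the Twitter dataset, introducing social circles and the attention mechanism reduces the MSE by 0.065 and 0.019, respectively.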

Keywords: social networks  popularity prediction  social circle  attention mechanism  user preference

Social Circle and Attention Based Information Popularity Prediction
ZHENG Zuo-Wu, SHAO Si-Qi, GAO Xiao-Feng, CHEN Gui-Hai. Social Circle and Attention Based Information Popularity Prediction[J]. Chinese Journal of Computers, 2021, 44(5): 921-936.
Authors: ZHENG Zuo-Wu  SHAO Si-Qi  GAO Xiao-Feng  CHEN Gui-Hai
Affiliation: (Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240)
Abstract: With the popularity of the Internet, social networks have become the main medium through which people discuss what happens in the real world. On social platforms such as Twitter, Weibo, and WeChat, user-generated content (UGC) forms a rich data stream that spreads across the network, giving users immediate insight into trending events and the information generated around them. Modeling and predicting hot information on social networks has a wide range of application scenarios and commercial value, such as information dissemination mining, advertising recommendation, and user behavior analysis, and has attracted much attention in data mining and social network analysis. In recent years, popularity prediction methods have fallen mainly into two categories, feature-based methods and time-series-based methods, both of which have been studied extensively, for example through feature engineering or by using stochastic processes to model the temporal sequence of information. However, these techniques require either long-term observation of the information or hand-crafted features that are expensive to extract. Moreover, they fail to consider the role of users' social circles when information spreads on social networks. Different users may care about different information and be affected by different user groups at the same time, while users in the same social circle tend to pay attention to similar information and have closer connections, which further encourages users in the same circle to influence each other. In this paper, we propose SCAP, a Social Circle and Attention based Popularity prediction model, which comprises an autoencoder-based social circle detection model and an attention-based information popularity prediction model. First, an autoencoder extracts user preferences from users' sequential behavior patterns; these preferences guide a clustering algorithm in discovering social circles and learning social circle embeddings. Then, for a new piece of UGC, the text embedding, user embedding, and temporal feature are extracted with a long short-term memory (LSTM) layer and an embedding layer. Furthermore, we capture the influence of different social circles on the UGC through an attention mechanism, which assigns a weight to each social circle. Finally, using the social circle attention, text embedding, user embedding, and temporal feature, the popularity of the UGC is predicted by two fully connected layers. We conduct extensive experiments on four real-world datasets from the Twitter, Weibo, and Douban Event platforms. The experimental results show that SCAP performs better than the baseline methods. We first evaluate effectiveness from both regression and classification perspectives: the mean squared error (MSE) of SCAP on the four datasets decreases by 0.017, 0.022, 0.021, and 0.031, respectively, and the F1-score increases by 0.034, 0.021, 0.034, and 0.025, respectively. We also consider the impact of different parameters on the model, including the number of users' recent sequences, the number of social circles, and the length of the temporal sequences. In addition, the effectiveness of the different features and of the attention mechanism is validated in an ablation study. For example, on the Twitter dataset, the MSE decreases by 0.065 and 0.019 when integrating social circles and the attention mechanism, respectively.
Keywords: social networks  popularity prediction  social circle  attention mechanism  user preference
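
The attention step described in the abstract, which weights each social circle by its influence on a new post and then fuses the result with the other features, can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring function, the use of a single query vector for the post, fusion by concatenation, and all names (`circle_attention`, `fuse`, the toy vectors) are assumptions made here for illustration, since the abstract does not specify them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def circle_attention(query, circle_embeddings):
    """Attend over social-circle embeddings with a post-level query vector.

    Assumption: scores are plain dot products; the abstract does not say
    which scoring function SCAP uses.
    Returns (weights, attended), where attended is the weighted sum of the
    circle embeddings.
    """
    scores = [sum(q * c for q, c in zip(query, emb)) for emb in circle_embeddings]
    weights = softmax(scores)
    dim = len(circle_embeddings[0])
    attended = [
        sum(w * emb[i] for w, emb in zip(weights, circle_embeddings))
        for i in range(dim)
    ]
    return weights, attended


def fuse(text_feat, user_feat, temporal_feat, circle_feat):
    """Fuse features by concatenation (an assumed fusion strategy)."""
    return text_feat + user_feat + temporal_feat + circle_feat


# Toy data: one post query and three hypothetical circle embeddings.
post_query = [1.0, 0.0]
circles = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]

weights, circle_feat = circle_attention(post_query, circles)
fused = fuse([0.1, 0.2], [0.3], [0.4, 0.5], circle_feat)
print(weights)
print(fused)
```

In the model described above, the fused vector would then be passed to two fully connected layers to produce the popularity prediction; those layers, like the LSTM and autoencoder, are omitted from this sketch.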
This article is indexed in VIP, Wanfang Data, and other databases.