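Rumor Stance Classification Algorithm Based on Variational Auto-Encoder
Cite this article: GUO Fengqi, MENG Fanrong, WANG Zhixiao. Rumor Stance Classification Algorithm Based on Variational Auto-Encoder[J]. Computer Engineering, 2022, 48(2): 99-105. DOI: 10.19678/j.issn.1000-3428.0060194
Authors: GUO Fengqi  MENG Fanrong  WANG Zhixiao
Affiliation: College of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
Funding: National Natural Science Foundation of China (61876186)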
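Abstract: In current rumor detection tasks, tweet data from the Twitter social media platform is complex and unevenly distributed. To address this, a rumor stance classification algorithm based on the Variational Auto-Encoder (VAE), named VAE-LSTM, is proposed. After the data is preprocessed, the word2vec model extracts word vectors from the tweets, and these vectors are fed into the VAE for training. This yields a deep feature sequence that follows a simple probability distribution, from which effective features are sampled so that tweet categories with large amounts of data do not dominate the feature vectors. A Long Short-Term Memory (LSTM) network then processes the resulting vector sequence to perform classification. Theoretical analysis and experimental results show that VAE-LSTM requires no manual extraction or addition of features, trains simply and efficiently, and alleviates the imbalance between classes. In a practical scenario it achieves an accuracy of 0.800 and an F1 score of 0.494, classifying better than the temporal attention mechanism algorithm, the Turing algorithm, and the Hawkes process algorithm, and it saves substantial data preprocessing time compared with SVM and other early machine learning methods.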

Keywords: Variational Auto-Encoder (VAE)  Long Short-Term Memory (LSTM) network  social network  rumor stance  deep feature
Received: 2020-12-04
Revised: 2021-01-27
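
The sampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the encoder, the latent dimension, or the exact "simple probability distribution", so this assumes the common Gaussian VAE setup with a standard normal prior. The function names `reparameterize` and `kl_to_standard_normal` and all numeric values are hypothetical and are not taken from the paper.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps element-wise, with eps ~ N(0, 1).

    Assumption: the encoder outputs a mean and a log-variance per latent
    dimension, as in a standard Gaussian VAE (not confirmed by the abstract).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


rng = random.Random(0)

# Hypothetical encoder outputs for one tweet (2 latent dimensions).
mu = [0.5, -1.0]
log_var = [0.0, -2.0]

z = reparameterize(mu, log_var, rng)      # sampled latent feature vector
kl = kl_to_standard_normal(mu, log_var)   # regularizer pulling codes toward N(0, I)

# Sanity checks: a code that already matches the prior has zero KL, and a
# near-zero variance makes the sample collapse onto the mean.
kl_zero = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
z_tight = reparameterize([1.0], [-100.0], rng)
```

In a pipeline like the one described, vectors such as `z` would be collected per tweet and passed as a sequence to the LSTM classifier. The KL term is what keeps the latent codes close to the assumed simple prior.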

Rumor Stance Classification Algorithm Based on Variational Auto-Encoder
GUO Fengqi,MENG Fanrong,WANG Zhixiao. Rumor Stance Classification Algorithm Based on Variational Auto-Encoder[J]. Computer Engineering, 2022, 48(2): 99-105. DOI: 10.19678/j.issn.1000-3428.0060194
Authors: GUO Fengqi  MENG Fanrong  WANG Zhixiao
Affiliation: College of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
Abstract: Tweet data on the Twitter platform is complex and unevenly distributed, which makes rumor classification difficult. To address this problem, a Variational Auto-Encoder (VAE)-based algorithm named VAE-LSTM is proposed for rumor stance classification. The data is first preprocessed, and the word2vec model is then used to extract word vectors from the tweets, which are fed into the VAE for training. This process generates a deep feature sequence that conforms to a simple probability distribution, and effective features are sampled from it, which prevents tweet categories with large amounts of data from influencing the feature vectors. On this basis, a Long Short-Term Memory (LSTM) network processes the vector sequence data to perform classification. Theoretical analysis and experimental results show that VAE-LSTM requires no manual extraction or addition of features, which makes the training process simple and efficient. It also alleviates the imbalance between classes. In a practical scenario, VAE-LSTM achieves an accuracy of 0.800 and an F1 score of 0.494, outperforming the temporal attention mechanism algorithm, the Turing algorithm, and the Hawkes Process (HP) algorithm. It also saves substantial data preprocessing time compared with SVM and other early machine learning methods.
Keywords: Variational Auto-Encoder (VAE)  Long Short-Term Memory (LSTM) network  social network  rumor stance  deep feature
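
The abstract reports an accuracy of 0.800 alongside a much lower F1 score of 0.494. It does not say how F1 is averaged. Assuming macro averaging over stance classes, the sketch below shows how the two metrics can diverge on imbalanced data. The labels and predictions are invented for illustration and are not the paper's data; the stance names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true positives, false positives, or false
        # negatives contributes 0 here by convention.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy imbalanced data: 8 of 10 tweets belong to the majority stance.
y_true = ["comment"] * 8 + ["support", "deny"]
# A degenerate classifier that always predicts the majority stance.
y_pred = ["comment"] * 10

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

In this toy case the majority-only classifier looks strong on accuracy while macro F1 stays low, because each minority class scores zero. That is why reporting both metrics matters when the classes are unbalanced.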
This article is indexed in databases including VIP (维普).