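Short Text Classification Model Combining Knowledge Graph and Attention Mechanism (融合知识图谱与注意力机制的短文本分类模型)
Cite this article: DING Chenhui (丁辰晖), XIA Hongbin (夏鸿斌), LIU Yuan (刘渊). Short Text Classification Model Combining Knowledge Graph and Attention Mechanism[J]. Computer Engineering (计算机工程), 2021, 47(1): 94-100.
Authors: DING Chenhui (丁辰晖), XIA Hongbin (夏鸿斌), LIU Yuan (刘渊)
Affiliations: School of Digital Media, Jiangnan University, Wuxi, Jiangsu 214122, China; Jiangsu Key Laboratory of Media Design and Software Technology, Wuxi, Jiangsu 214122, China
Funding: National Science and Technology Support Program of China; National Natural Science Foundation of China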
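Abstract: To address the semantic ambiguity caused by the lack of contextual information in short texts, this paper constructs a neural network model that combines a knowledge graph with an attention mechanism. An existing knowledge base is used to retrieve the set of concepts related to a short text, which supplies prior knowledge and compensates for the missing context. Character vectors, word vectors, and the concept set of the short text serve as the model input. An encoder-decoder model encodes the short text and the concept set, and an attention mechanism computes a weight for each concept to reduce the influence of irrelevant, noisy concepts on classification. On this basis, a bidirectional gated recurrent unit (Bi-GRU) encodes the short-text input sequence to obtain classification features, enabling more accurate short text classification. Experimental results show that the model reaches accuracies of 73.95%, 40.69%, and 63.10% on the AGNews, Ohsumed, and TagMyNews short-text datasets, respectively, indicating good classification ability.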

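Keywords: short text classification; knowledge graph; natural language processing; attention mechanism; bidirectional gated recurrent unit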

Short Text Classification Model Combining Knowledge Graph and Attention Mechanism
DING Chenhui, XIA Hongbin, LIU Yuan. Short Text Classification Model Combining Knowledge Graph and Attention Mechanism[J]. Computer Engineering, 2021, 47(1): 94-100.
Authors: DING Chenhui, XIA Hongbin, LIU Yuan
Affiliation: School of Digital Media, Jiangnan University, Wuxi, Jiangsu 214122, China; Jiangsu Key Laboratory of Media Design and Software Technology, Wuxi, Jiangsu 214122, China
Abstract: To address the semantic ambiguity caused by the lack of context information, this paper proposes a neural network model that combines a knowledge graph and an attention mechanism. An existing knowledge base is used to obtain the concept set related to the short text, which provides prior knowledge and compensates for the lack of context information in the short text. The character vectors, word vectors, and concept set of the short text are taken as the input of the model. An encoder-decoder model is then used to encode the short text and the concept set, and the attention mechanism is used to calculate the weight of each concept to reduce the influence of unrelated noise concepts on short text classification. On this basis, a Bi-directional Gated Recurrent Unit (Bi-GRU) is used to encode the input sequence of the short text to obtain short text classification features, so as to perform short text classification more accurately. Experimental results show that the accuracy of the model on the AGNews, Ohsumed, and TagMyNews short text datasets is 73.95%, 40.69%, and 63.10%, respectively, showing good classification ability.
Keywords: short text classification; knowledge graph; Natural Language Processing (NLP); attention mechanism; Bi-directional Gated Recurrent Unit (Bi-GRU)
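
The concept-weighting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the scoring function, the vector dimensions, or how the text representation is formed, so everything below is an assumption: the sketch uses plain dot-product scores followed by a softmax, and the names `softmax`, `concept_attention`, `text_vec`, and `concept_vecs` are hypothetical. It is not the paper's implementation, only a picture of how attention can down-weight concepts that do not match the short text.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def concept_attention(text_vec, concept_vecs):
    """Weight each concept vector by its similarity to the text vector.

    Dot-product scoring is an assumption of this sketch, not the
    scoring function used in the paper.
    Returns (weights, pooled), where pooled is the weighted sum of concepts.
    """
    scores = [sum(t * c for t, c in zip(text_vec, vec)) for vec in concept_vecs]
    weights = softmax(scores)
    dim = len(text_vec)
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, concept_vecs))
        for d in range(dim)
    ]
    return weights, pooled


# Toy, hand-made 2-d vectors (hypothetical): the text vector points along
# the first axis, so the concept aligned with it should dominate.
text_vec = [1.0, 0.0]
concept_vecs = {
    "fruit": [0.0, 2.0],    # unrelated to the text vector (score 0)
    "company": [2.0, 0.0],  # aligned with the text vector (score 2)
}
names = list(concept_vecs)
weights, pooled = concept_attention(text_vec, [concept_vecs[n] for n in names])
for name, weight in zip(names, weights):
    print(f"{name}: {weight:.3f}")
```

In this toy setup the concept aligned with the text vector receives most of the weight, so the pooled concept representation is dominated by it. In a full model, the vectors would come from learned embeddings and a Bi-GRU encoder rather than hand-written numbers.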
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.