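Retrieval matching question and answer method based on improved CLSM with attention mechanism
Cite this article: YU Chongchong, CAO Shuai, PAN Bo, ZHANG Qingchuan, XU Shixuan. Retrieval matching question and answer method based on improved CLSM with attention mechanism[J]. Journal of Computer Applications, 2019, 39(4): 972-976.
Authors: YU Chongchong  CAO Shuai  PAN Bo  ZHANG Qingchuan  XU Shixuan
Affiliations: School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048; Institute of Ethnology and Anthropology, Chinese Academy of Social Sciences, Beijing 100081
Funding: Humanities and Social Sciences Research Planning Fund of the Ministry of Education (16YJAZH072); Major Program of the National Social Science Foundation of China (14ZDB156).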
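Abstract: To address the weak adaptability of retrieval-based matching question answering models to Chinese corpora and their neglect of sentence-level semantic information, a Chinese text semantic matching model based on the Convolutional neural network Latent Semantic Model (CLSM) is proposed. First, the traditional CLSM is modified by removing the word and letter N-gram layers, which improves the model's adaptability to Chinese corpora. Second, an attention mechanism is used to build an entity attention layer over the input Chinese word vectors, strengthening the weights of the core words in a sentence. Finally, a Convolutional Neural Network (CNN) captures the contextual structure of the input sentence, and a pooling layer reduces the dimensionality of the extracted semantic information. On a medical question-answer pair dataset, the improved model is compared with traditional semantic models, traditional translation models and deep neural network models. The experimental results show that the proposed model improves Normalized Discounted Cumulative Gain (NDCG) by 4 to 10 percentage points and outperforms the comparison models.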

Keywords: latent semantic model; attention mechanism; retrieval-based matching question answering
Received: 2018-08-15
Revised: 2018-11-12
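
The pipeline outlined in the abstract (attention over word vectors, convolution over context windows, pooling, then matching) can be illustrated with a minimal sketch. This is not the authors' implementation: the attention query vector `query`, the filter weights, the tanh activation, the window size and the cosine matching step are all assumptions chosen for illustration, and the record does not describe how the entity attention layer is actually parameterized.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(word_vecs, query):
    """Scale each word vector by its softmax attention weight.

    `query` is a hypothetical attention vector standing in for the
    entity attention layer; it is an assumption, not the paper's design.
    """
    weights = softmax([dot(vec, query) for vec in word_vecs])
    return [[w * x for x in vec] for w, vec in zip(weights, word_vecs)]

def conv_max_pool(word_vecs, filters, window=2):
    """Apply tanh filters to each window of words, then max-pool over positions."""
    dim = len(word_vecs[0])
    # Right-pad with zero vectors so every word starts one full window.
    padded = word_vecs + [[0.0] * dim for _ in range(window - 1)]
    feature_maps = []
    for start in range(len(word_vecs)):
        concat = [x for vec in padded[start:start + window] for x in vec]
        feature_maps.append([math.tanh(dot(f, concat)) for f in filters])
    # One value per filter: the sentence vector has a fixed size.
    return [max(column) for column in zip(*feature_maps)]

def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

def match_score(question, answer, query, filters, window=2):
    """Encode both sentences the same way and compare them (assumed cosine)."""
    q = conv_max_pool(attend(question, query), filters, window)
    a = conv_max_pool(attend(answer, query), filters, window)
    return cosine(q, a)

# Toy usage with made-up 2-dimensional word vectors and two filters.
query = [1.0, 1.0]
filters = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
question = [[1.0, 0.0], [0.0, 1.0]]
answer = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
score = match_score(question, answer, query, filters)
```

Max pooling is what lets sentences of different lengths map to vectors of the same size, so a question and a candidate answer can be compared directly.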

Retrieval matching question and answer method based on improved CLSM with attention mechanism
YU Chongchong, CAO Shuai, PAN Bo, ZHANG Qingchuan, XU Shixuan. Retrieval matching question and answer method based on improved CLSM with attention mechanism[J]. Journal of Computer Applications, 2019, 39(4): 972-976.
Authors: YU Chongchong  CAO Shuai  PAN Bo  ZHANG Qingchuan  XU Shixuan
Affiliation: 1. School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China; 2. Institute of Ethnology and Anthropology, Chinese Academy of Social Sciences, Beijing 100081, China
Abstract: To address the weak adaptability of the Retrieval Matching Question and Answer (RMQA) model to Chinese corpora and its neglect of sentence-level semantic information, a Chinese text semantic matching model based on the Convolutional neural network Latent Semantic Model (CLSM) was proposed. Firstly, the word-N-gram layer and letter-N-gram layer of CLSM were removed to enhance the adaptability of the model to Chinese corpora. Secondly, focusing on the vector information of the input Chinese words, an entity attention layer was built with an attention mechanism to strengthen the weights of the core words in a sentence. Finally, a Convolutional Neural Network (CNN) was used to capture the contextual structure information of the input sentence, and a pooling layer was used to reduce the dimensionality of the semantic information. In experiments on a medical question and answer dataset, the proposed model improved Normalized Discounted Cumulative Gain (NDCG) by 4 to 10 percentage points compared with traditional semantic models, traditional translation models and deep neural network models.
Keywords: Convolutional Latent Semantic Model (CLSM); attention mechanism; Retrieval Matching Question and Answer (RMQA)
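
For readers unfamiliar with the evaluation metric named in the abstract, the sketch below shows one common way to compute NDCG for a single ranked list of answers. It assumes linear gain with a log2 rank discount; the record does not state which gain formulation or cutoff the authors used, so the function names and the parameter `k` here are illustrative only.

```python
import math

def dcg(relevances, k):
    """Discounted cumulative gain over the top-k graded relevance labels.

    Assumes linear gain: the item at 1-based rank r contributes
    rel / log2(r + 1).
    """
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg(relevances, k):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance labels of candidate answers, in the order a model ranked them.
ranked_labels = [2, 0, 3, 1]
value = ndcg(ranked_labels, k=4)
```

A ranking that already lists answers in decreasing relevance scores 1.0, and any other ordering of the same labels scores lower, which is why a gain of several percentage points indicates that relevant answers are being placed nearer the top.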
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).