
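A Semantic Orthogonal Matching Method for Question Paraphrase Identification
Cite this article: ZHU Mengmeng, WU Kaili, HONG Yu, CHEN Xin, ZHANG Min. A Semantic Orthogonal Matching Method for Question Paraphrase Identification[J]. Journal of Chinese Information Processing, 2021, 35(11): 34-42.
Authors: ZHU Mengmeng  WU Kaili  HONG Yu  CHEN Xin  ZHANG Min
Affiliation: School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006
Funding: National Natural Science Foundation of China (61672367, 61672368); National Key Research and Development Program of China (2017YFB1002104); Postgraduate Research and Practice Innovation Program of Jiangsu Province (SJCX19_0926)
Abstract: Question paraphrase identification aims to determine whether two natural-language questions are semantically equivalent. Semantic understanding of the questions and the interaction between them are key to solving this task. Existing work typically builds on semantic-level encodings of the questions and extracts shallow semantic features through fusion or interaction, which then support the semantic computation between the paraphrase questions. However, if the points that two questions share and the points on which they differ can be identified, a more accurate judgment can be made from that information. Based on this idea, this paper proposes a semantic orthogonal matching method, introducing semantic orthogonalization into question paraphrase identification. The method uses semantic orthogonalization to split each question into a representation of its similarity to the other question and a representation of its difference from it, which both enriches the semantic representation of the questions and achieves multi-granularity semantic feature fusion. Experiments on the Chinese dataset LCQMC and the English dataset Quora demonstrate the effectiveness of the semantic orthogonal matching method for question paraphrase identification.

Keywords: paraphrase identification  orthogonalization  multi-granularity
Received: 2020-03-25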

A Semantic Orthogonal Matching Method for Question Paraphrase Identification
ZHU Mengmeng,WU Kaili,HONG Yu,CHEN Xin,ZHANG Min.A Semantic Orthogonal Matching Method for Question Paraphrase Identification[J].Journal of Chinese Information Processing,2021,35(11):34-42.
Authors:ZHU Mengmeng  WU Kaili  HONG Yu  CHEN Xin  ZHANG Min
Affiliation:School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
Abstract: Question paraphrase identification aims to determine whether two natural questions are semantically equivalent; semantic understanding of the questions and the interaction between them are the core issues. Current approaches usually encode the sentences into vector representations and then fuse or interact the two representations to extract shallow semantic features, which serve as evidence for judging equivalence. To further capture the points that the two questions share and the points on which they differ, this paper proposes a model that integrates semantic orthogonal information. In this method, each question is decomposed into a representation similar to the other question and a representation different from it, which enriches the representations of the questions and realizes multi-granularity feature fusion. Experiments conducted on two real-world public datasets, LCQMC and Quora, demonstrate the effectiveness of this method.
Keywords:paraphrase identification  orthogonal  multi-granularity  
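
The abstract does not spell out how the similar and different representations are computed. As a rough illustration only, the sketch below assumes the simplest reading: each question is a single vector, the "similar" part is its projection onto the other question's vector, and the "different" part is the orthogonal remainder. The names `dot` and `orthogonal_decompose` and the toy vectors are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_decompose(q: Vector, other: Vector,
                         eps: float = 1e-12) -> Tuple[Vector, Vector]:
    """Split q into a part parallel to `other` and a part orthogonal to it.

    Assumption (not from the paper): plain vector projection stands in
    for the semantic orthogonalization step described in the abstract.
    """
    # Projection coefficient; eps guards against a zero vector.
    scale = dot(q, other) / (dot(other, other) + eps)
    similar = [scale * x for x in other]              # shared component
    different = [a - s for a, s in zip(q, similar)]   # orthogonal remainder
    return similar, different


# Toy question vectors (hypothetical values).
q1 = [1.0, 2.0, 0.0]
q2 = [2.0, 0.0, 0.0]
similar, different = orthogonal_decompose(q1, q2)
```

By construction the two parts sum back to the original vector, and the "different" part has a near-zero inner product with the other question. A matching model along the lines of the abstract could then feed both parts, for each question, into its fusion layer.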