A Multi-sense Word Embedding Method Based on Gated Convolution and Hierarchical Attention Mechanism
Cite this article: LIU Yang, JI Lixin, HUANG Ruiyang, ZHU Yuhang, LI Xing. A Multi-sense Word Embedding Method Based on Gated Convolution and Hierarchical Attention Mechanism[J]. Journal of Chinese Information Processing, 2018, 32(7): 1.
Authors: LIU Yang, JI Lixin, HUANG Ruiyang, ZHU Yuhang, LI Xing
Affiliation: National Digital Switching System Engineering and Technological R & D Center, Zhengzhou, Henan 450002, China
Funding: National Natural Science Foundation of China (61601513)
Abstract: Existing methods that map each word to a single vector ignore polysemy and therefore cause ambiguity. Methods that map a word to multiple vectors or to a Gaussian distribution do account for polysemy, but to varying degrees they fail to make effective use of word order, syntactic structure, and inter-word distance, all of which affect the meaning a word expresses in a given context. To address these problems, this paper proposes a method that combines a gated convolution mechanism encapsulated in non-residual blocks with a hierarchical attention mechanism. Working at the sub-sense layer and the synthetic-sense layer of the words in the selected context window, the method obtains a synthetic sense vector for the target word under an asymmetric context window and uses it to predict the target word; training in this way on a given corpus yields the multi-sense word embeddings. On a small-scale corpus, the resulting embeddings improve accuracy on the semantic part of the word analogy task by up to 1.42% over the baseline methods, and on word similarity datasets including WordSim353, MC, RG, and RW they achieve an average improvement of 2.11% (up to 5.47%) over the baselines. In language modeling experiments, the method also significantly outperforms other methods that predict the target word.

Keywords: multi-sense word embedding; hierarchical attention; gated convolution
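
The abstract names two mechanisms, gated convolution without a residual connection and attention applied at two levels (word senses, then context words). The Python sketch below shows one way these pieces could fit together on toy data. It is not the authors' implementation: the function names (`attend`, `gated_conv`, `context_vector`), the dot-product attention scoring, the GLU-style sigmoid gate, the query vectors, and all numbers are assumptions made for illustration, since the abstract does not specify them.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(matrix, vec):
    return [dot(row, vec) for row in matrix]

def attend(vectors, query):
    """Softmax-weighted average of `vectors`, scored by dot product with `query`."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights

def gated_conv(seq, w_lin, w_gate, width):
    """GLU-style gated convolution over `seq`, with no residual connection.

    Each output position is (W_lin x) * sigmoid(W_gate x), where x is the
    concatenation of `width` consecutive input vectors.
    """
    out = []
    for t in range(len(seq) - width + 1):
        patch = [x for vec in seq[t:t + width] for x in vec]  # flatten the patch
        lin = matvec(w_lin, patch)
        gate = matvec(w_gate, patch)
        out.append([a * sigmoid(b) for a, b in zip(lin, gate)])
    return out

def context_vector(sense_vectors, sense_query, word_query, w_lin, w_gate, width):
    # Level 1 (assumed): attention over each context word's sense vectors.
    words = [attend(senses, sense_query)[0] for senses in sense_vectors]
    # Convolving over the ordered context words keeps word-order information.
    hidden = gated_conv(words, w_lin, w_gate, width)
    # Level 2 (assumed): attention over the convolved positions.
    return attend(hidden, word_query)

# Toy context window: three context words, two 2-dimensional senses each.
sense_vectors = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [1.0, -1.0]],
    [[0.0, 1.0], [-1.0, 0.0]],
]
w_lin = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]]
w_gate = [[0.0, 0.1, 0.0, 0.1], [0.1, 0.0, 0.1, 0.0]]

vec, weights = context_vector(
    sense_vectors, [1.0, 0.0], [0.0, 1.0], w_lin, w_gate, width=2
)
print("context vector:", vec)
print("position weights:", weights)
```

In a trained model the resulting context vector would be scored against candidate target words; here the weights are fixed by hand, so the output only demonstrates the data flow. An asymmetric window would simply mean passing a different number of context words from the left and right of the target.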