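Biomedical Named Entity Recognition Based on a CNN-BLSTM-CRF Model
Cite this article: LI Lishuang, GUO Yuankai. Biomedical Named Entity Recognition Based on a CNN-BLSTM-CRF Model [J]. Journal of Chinese Information Processing, 2018, 32(1): 116-122.
Authors: LI Lishuang, GUO Yuankai
Affiliation: School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116023
Funding: National Natural Science Foundation of China (61672126)
Abstract: Named entity recognition is an important step in natural language processing tasks. In recent years, neural networks that do not rely on hand-crafted features have performed well on named entity recognition in general domains such as news. In the biomedical domain, however, many experiments show that hand-crafted features based on domain knowledge strongly affect the results of neural network models. How to achieve good biomedical named entity recognition performance without relying on hand-crafted features therefore remains an open problem. This paper proposes a neural network model based on CNN-BLSTM-CRF. First, a convolutional neural network (CNN) is used to train character-level vectors that capture the morphological features of each word, and word vectors carrying semantic information are obtained by training on a large-scale background corpus. The two are then combined as the input, on which a BLSTM-CRF deep neural network model suited to biomedical named entity recognition is built. Experimental results show that, without relying on any hand-crafted features, the proposed method achieves the best results reported to date on the BioCreative II GM and JNLPBA2004 biomedical corpora, with F-scores of 89.09% and 74.40%, respectively.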
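
The input representation described in this abstract (character-level CNN features combined with a word vector) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy dimensions, the randomly initialised tables `char_emb`, `word_emb`, and `filters`, and the function names `char_cnn` and `token_representation` are all hypothetical, and the paper's actual hyperparameters and training procedure are not given here.

```python
import random
import string

# Hypothetical toy sizes; the paper's real dimensions are not stated in the abstract.
CHAR_DIM, WORD_DIM, NUM_FILTERS, WIDTH = 4, 5, 3, 3

rng = random.Random(0)

def rand_vec(n):
    """Random vector standing in for trained parameters."""
    return [rng.uniform(-0.5, 0.5) for _ in range(n)]

# Assumed lookup tables. In practice the word vectors would be pre-trained on a
# large background corpus and the character vectors and filters would be learned.
char_emb = {c: rand_vec(CHAR_DIM) for c in string.ascii_letters + string.digits + "-"}
word_emb = {w: rand_vec(WORD_DIM) for w in ("il-2", "gene", "expression")}
filters = [rand_vec(WIDTH * CHAR_DIM) for _ in range(NUM_FILTERS)]

def char_cnn(word):
    """Slide each filter over the word's characters, then max-pool over positions."""
    zero = [0.0] * CHAR_DIM
    pad = [zero] * (WIDTH - 1)  # padding so short words still yield a window
    rows = pad + [char_emb.get(c, zero) for c in word] + pad
    features = []
    for f in filters:
        scores = []
        for i in range(len(rows) - WIDTH + 1):
            window = [x for row in rows[i:i + WIDTH] for x in row]
            scores.append(sum(w * x for w, x in zip(f, window)))
        features.append(max(scores))  # max-over-time pooling
    return features

def token_representation(word):
    """Concatenate the word vector with the character-level CNN features."""
    word_vec = word_emb.get(word.lower(), [0.0] * WORD_DIM)
    return word_vec + char_cnn(word)

vec = token_representation("IL-2")
print(len(vec))  # 8 = WORD_DIM + NUM_FILTERS
```

The word lookup is lowercased while the character CNN sees the original casing, so shape cues such as capital letters, digits, and hyphens reach the tagger only through the character channel. Each token's concatenated vector would then be the per-position input to the BLSTM-CRF.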

Keywords: biomedical named entity recognition; LSTM; CNN

Biomedical Named Entity Recognition with CNN-BLSTM-CRF
LI Lishuang, GUO Yuankai. Biomedical Named Entity Recognition with CNN-BLSTM-CRF [J]. Journal of Chinese Information Processing, 2018, 32(1): 116-122.
Authors: LI Lishuang, GUO Yuankai
Affiliation: School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116023, China
Abstract: Named entity recognition (NER) is an important step in natural language processing (NLP). In recent years, end-to-end neural network models for NER have performed well on general-domain datasets (e.g. news) without additional hand-crafted features. In the biomedical domain, however, recent studies indicate that hand-designed features have a great impact on a model's performance. In this paper, we propose a novel end-to-end neural network model, CNN-BLSTM-CRF, which does not rely on hand-designed features or domain knowledge. A convolutional neural network (CNN) extracts character vectors with shape features from each word; these are concatenated with the word embeddings and fed to the BLSTM-CRF network. We evaluate our approach against existing neural network models for NER on the BioCreative II GM and JNLPBA2004 datasets. The results show that our system reaches F-scores of 89.09% and 74.40%, respectively, and outperforms other state-of-the-art methods.
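
To illustrate what the CRF layer adds on top of per-token scores, the following is a minimal sketch of Viterbi decoding, the standard way to find the best tag sequence in a linear-chain CRF. It is not taken from the paper: the O/B/I tag set, the `viterbi` function, and all scores are made-up toy values chosen only to show how a transition penalty changes the decoded sequence.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions[t][j]: score of tag j at position t (e.g. produced by a BLSTM).
    transitions[i][j]: score of moving from tag i to tag j.
    """
    n = len(tags)
    score = list(emissions[0])
    backptrs = []
    for em in emissions[1:]:
        new_score, ptr = [], []
        for j in range(n):
            # Best previous tag for reaching tag j at this position.
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + em[j])
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)
    # Follow the back-pointers from the best final tag.
    path = [max(range(n), key=lambda j: score[j])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return [tags[k] for k in path]

tags = ["O", "B", "I"]
# Toy transition scores: only O -> I (an entity continuing from nothing) is penalised.
transitions = [[0.0, 0.0, -10.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
# Toy emission scores for three tokens; picking the per-token maximum gives O, I, O.
emissions = [[1.0, 0.5, 0.0],
             [0.2, 0.1, 1.5],
             [1.0, 0.0, 0.3]]

print(viterbi(emissions, transitions, tags))  # ['B', 'I', 'O']
```

Taking the highest-scoring tag for each token independently would yield the inconsistent sequence O, I, O; scoring whole sequences with transition terms yields B, I, O instead.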
Keywords: biomedical NER; LSTM; CNN