Named entity recognition based on BERT and joint learning for judgment documents
Citation: Lanlan ZENG, Yisong WANG, Panfeng CHEN. Named entity recognition based on BERT and joint learning for judgment documents[J]. Journal of Computer Applications, 2022, 42(10): 3011-3017.
Authors: Lanlan ZENG  Yisong WANG  Panfeng CHEN
Affiliation: College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China
Funding: National Natural Science Foundation of China (U1836205)
Received: 2021-09-03
Revised: 2021-12-02

Abstract: Correctly identifying the entities in judgment documents is an important foundation for building legal knowledge graphs and realizing smart courts. However, commonly used Named Entity Recognition (NER) models do not handle two problems in judgment documents well: the representation of polysemous words and errors in entity boundary recognition. To improve the recognition of the various entity types in judgment documents, a BiLSTM-CRF model based on Joint Learning and BERT (JLB-BiLSTM-CRF) is proposed, where BiLSTM-CRF denotes a Bidirectional Long Short-Term Memory network with a Conditional Random Field and BERT denotes Bidirectional Encoder Representations from Transformers. First, BERT encodes the input character sequence to strengthen the representational power of the word vectors. Then, a BiLSTM network models long-text information, and the NER task is trained jointly with a Chinese Word Segmentation (CWS) task to improve the entity boundary recognition rate. Experimental results show that the proposed model achieves a precision of 94.36%, a recall of 94.94%, and an F1 score of 94.65% on the test set, which are 1.05, 0.48, and 0.77 percentage points higher, respectively, than those of the BERT-BiLSTM-CRF model, verifying the effectiveness of the JLB-BiLSTM-CRF model on the NER task for judgment documents.

Keywords: judgment document  Bidirectional Long Short-Term Memory (BiLSTM) network  BERT (Bidirectional Encoder Representations from Transformers)  joint learning  Named Entity Recognition (NER)
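
The abstract states that NER is trained jointly with CWS to improve entity boundary recognition. The sketch below illustrates why the two label sequences are related: both are defined over the same characters, and entity boundaries tend to coincide with word boundaries. It is a minimal illustration, not the authors' code. The BMES and BIO tagging schemes, the example sentence, the entity types `PER` and `CRIME`, and the function names `cws_tags` and `ner_tags` are all assumptions made for this example; the abstract does not specify them.

```python
from typing import List, Tuple


def cws_tags(words: List[str]) -> List[str]:
    """Return one BMES segmentation tag per character (assumed scheme)."""
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # Begin, zero or more Middle tags, End
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def ner_tags(n_chars: int, entities: List[Tuple[int, int, str]]) -> List[str]:
    """Return one BIO entity tag per character (assumed scheme).

    Each entity is (start, end_exclusive, type) in character offsets.
    """
    tags = ["O"] * n_chars
    for start, end, etype in entities:
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
    return tags


# Hypothetical sentence: "Defendant Zhang San committed the crime of theft."
words = ["被告人", "张三", "犯", "盗窃罪"]
sentence = "".join(words)
# Hypothetical entity annotations over character offsets.
entities = [(3, 5, "PER"), (6, 9, "CRIME")]

cws = cws_tags(words)
ner = ner_tags(len(sentence), entities)

# Print the two label sequences side by side, one character per row.
for char, seg_tag, ent_tag in zip(sentence, cws, ner):
    print(f"{char}\t{seg_tag}\t{ent_tag}")

# In this example every entity starts where a word starts (B or S)
# and ends where a word ends (E or S).
boundaries_agree = all(
    cws[start] in ("B", "S") and cws[end - 1] in ("E", "S")
    for start, end, _ in entities
)
print("entity boundaries align with word boundaries:", boundaries_agree)
```

In a joint model of the kind the abstract describes, a shared encoder would predict both sequences, so the segmentation labels give the encoder an additional training signal about where words, and therefore candidate entities, begin and end. How the two task losses are combined is not stated in the abstract.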