
Named entity recognition based on BERT and joint learning for judgment documents

Original title:	基于BERT和联合学习的裁判文书命名实体识别

Cite this article:	Lanlan ZENG, Yisong WANG, Panfeng CHEN. Named entity recognition based on BERT and joint learning for judgment documents[J]. Journal of Computer Applications, 2022, 42(10): 3011-3017.

Authors:	Lanlan ZENG (曾兰兰), Yisong WANG (王以松), Panfeng CHEN (陈攀峰)

Affiliation:	College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China

Funding:	National Natural Science Foundation of China (U1836205)

Received:	2021-09-03
Revised:	2021-12-02

Abstract:	Correctly identifying the entities in judgment documents is an important foundation for building legal knowledge graphs and realizing smart courts. However, commonly used Named Entity Recognition (NER) models do not handle polysemous word representation or entity boundary recognition errors in judgment documents well. To improve the recognition of the various entity types in judgment documents, a BiLSTM-CRF model based on Joint Learning and BERT (Bidirectional Encoder Representations from Transformers), named JLB-BiLSTM-CRF, was proposed, where BiLSTM-CRF denotes a Bidirectional Long Short-Term Memory network followed by a Conditional Random Field. Firstly, the input character sequence was encoded by BERT to enhance the representation ability of the word vectors. Then, long text information was modeled by a BiLSTM network, and the NER task and the Chinese Word Segmentation (CWS) task were jointly trained to improve the boundary recognition rate of entities. Experimental results show that the proposed model achieves a precision of 94.36%, a recall of 94.94%, and an F1 score of 94.65% on the test set, which are 1.05, 0.48 and 0.77 percentage points higher, respectively, than those of the BERT-BiLSTM-CRF model, verifying the effectiveness of JLB-BiLSTM-CRF on the NER task for judgment documents.
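The reported metrics can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual definition of F1 as the harmonic mean of precision and recall. The baseline precision and recall are not stated in the abstract; they are inferred here by subtracting the reported gains, so treat them as derived values rather than figures quoted from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for JLB-BiLSTM-CRF, in percent.
proposed_p, proposed_r = 94.36, 94.94

# ASSUMPTION: baseline (BERT-BiLSTM-CRF) values are inferred by subtracting
# the reported gains of 1.05 and 0.48 percentage points.
baseline_p = round(proposed_p - 1.05, 2)
baseline_r = round(proposed_r - 0.48, 2)

proposed_f1 = round(f1_score(proposed_p, proposed_r), 2)
baseline_f1 = round(f1_score(baseline_p, baseline_r), 2)
f1_gain = round(proposed_f1 - baseline_f1, 2)

print(proposed_f1, baseline_f1, f1_gain)
```

Under these assumptions the recomputed F1 and F1 gain agree with the 94.65% and 0.77 percentage points given in the abstract.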

Keywords:	judgment document; Bidirectional Long Short-Term Memory (BiLSTM) network; BERT (Bidirectional Encoder Representations from Transformers); joint learning; Named Entity Recognition (NER)
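To illustrate why joint training with word segmentation could help entity boundaries, here is a minimal sketch of how one character sequence can carry two label sequences, one per task. It is not the authors' implementation. The example sentence, the BMES and BIO tagging schemes, the entity types `PER` and `CRIME`, and the weighted-sum loss with `cws_weight` are all assumptions made for illustration; the abstract does not specify any of them.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def cws_tags(words: List[str]) -> List[str]:
    """Character-level BMES segmentation tags for a pre-segmented sentence."""
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def ner_tags(n_chars: int, entities: List[Span]) -> List[str]:
    """Character-level BIO entity tags built from character spans."""
    tags = ["O"] * n_chars
    for start, end, etype in entities:
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
    return tags


def boundaries_align(cws: List[str], entities: List[Span]) -> bool:
    """True if every entity starts and ends on a word boundary."""
    return all(cws[s] in "BS" and cws[e - 1] in "ES" for s, e, _ in entities)


def joint_loss(ner_loss: float, cws_loss: float, cws_weight: float = 1.0) -> float:
    """ASSUMPTION: a weighted sum; the paper's combination is not given here."""
    return ner_loss + cws_weight * cws_loss


# Made-up sentence: "defendant / Zhang San / commits / theft".
words = ["被告人", "张三", "犯", "盗窃罪"]
entities = [(3, 5, "PER"), (6, 9, "CRIME")]  # hypothetical entity types

chars = list("".join(words))
cws = cws_tags(words)
ner = ner_tags(len(chars), entities)

for ch, c, n in zip(chars, cws, ner):
    print(ch, c, n)
```

In this toy example every entity begins on a `B` or `S` segmentation tag and ends on an `E` or `S` tag, which is the kind of boundary signal a shared encoder could pick up from the segmentation task.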
