
Chinese Text Classification Method Based on the ERNIE-BiGRU Model

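Cite this article:	LEI Jingsheng (雷景生), QIAN Ye (钱叶). Chinese Text Classification Method Based on the ERNIE-BiGRU Model[J]. Journal of Shanghai University of Electric Power (上海电力学院学报), 2020, 36(4): 329-335, 350.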

Authors:	LEI Jingsheng (雷景生), QIAN Ye (钱叶)

Affiliation:	School of Computer Science and Technology, Shanghai University of Electric Power (上海电力大学)

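Abstract:	Word-vector representations used in news text classification do not adequately preserve the information a character carries within its sentence, including its polysemy. To address this, the method uses the Enhanced Representation through Knowledge Integration (ERNIE) pre-trained model to compute each character's vector from its context, so that the representation retains the character's contextual information and also adjusts to its different senses, which strengthens its semantic representation. A bidirectional gated recurrent unit (BiGRU) is added after the ERNIE model, and the trained word vectors are used as the input of the BiGRU for training to obtain the text classification result. Experiments show that on THUCNews, a public dataset of Sina News, the model achieves a precision of 94.32%, a recall of 94.12%, and an F1 score of 0.9422, indicating good performance on Chinese text classification tasks.
Keywords:	text classification; Enhanced Representation through Knowledge Integration (ERNIE) model; bidirectional gated recurrent unit (BiGRU) model; pre-trained model; knowledge integration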
Received:	2020/2/24
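
The pipeline described in the abstract (contextual vectors from ERNIE, followed by a BiGRU and a classification layer) can be sketched roughly as follows. This is a minimal illustration in PyTorch, not the authors' implementation: the ERNIE encoder is replaced by a random tensor standing in for its per-token output, and the embedding size (768), hidden size (128), sequence length (32), and number of classes (10) are placeholder assumptions that do not come from the paper. Using the final forward and backward hidden states as the sentence feature is also an assumption.

```python
import torch
from torch import nn


class BiGRUClassifier(nn.Module):
    """BiGRU head over per-token contextual embeddings (e.g. from an ERNIE encoder)."""

    def __init__(self, embed_dim: int, hidden_dim: int, num_classes: int):
        super().__init__()
        # Single-layer bidirectional GRU; input is (batch, seq_len, embed_dim).
        self.bigru = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Forward and backward final states are concatenated, hence 2 * hidden_dim.
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        _, h_n = self.bigru(embeddings)                 # h_n: (2, batch, hidden_dim)
        features = torch.cat([h_n[0], h_n[1]], dim=-1)  # (batch, 2 * hidden_dim)
        return self.classifier(features)                # logits: (batch, num_classes)


torch.manual_seed(0)

# Stand-in for ERNIE output: 4 texts, 32 tokens each, 768-dim vectors (all assumed sizes).
embeddings = torch.randn(4, 32, 768)

model = BiGRUClassifier(embed_dim=768, hidden_dim=128, num_classes=10)
logits = model(embeddings)
predicted = logits.argmax(dim=-1)  # one predicted class index per text

print(logits.shape, predicted.shape)
```

In a real setup the random tensor would be replaced by the output of a pre-trained ERNIE encoder, and the whole stack would be trained with a cross-entropy loss on labeled news texts.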
Chinese-text Classification Method Based on ERNIE-BiGRU

LEI Jingsheng, QIAN Ye. Chinese-text Classification Method Based on ERNIE-BiGRU[J]. Journal of Shanghai University of Electric Power, 2020, 36(4): 329-335, 350.

Authors:	LEI Jingsheng, QIAN Ye

Affiliation:	School of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 200082, China

Abstract:	In news text classification, word-vector representations do not adequately preserve the information a word carries in its sentence or its ambiguity. Using the ERNIE pre-trained model, the vector of each word is computed from its context. This retains the word's contextual information while adjusting to its ambiguity, which enhances the word's semantic representation. A BiGRU layer is added after the ERNIE model, and the trained word vectors are used as the input of the BiGRU for training to obtain the text classification result. Experiments show that on THUCNews, the public dataset of Sina News, the model achieves a precision of 94.32%, a recall of 94.12%, and an F1 value of 0.9422, indicating good performance on Chinese text classification tasks.

Keywords:	text classification; enhanced representation through knowledge integration; bidirectional gated recurrent unit; pre-trained model; knowledge integration
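
As a quick consistency check on the figures in the abstract, the F1 value can be recomputed as the harmonic mean of the reported precision and recall. The sketch below assumes the three reported numbers are the same kind of aggregate (for example, all computed over the same test set in the same way); the function name `f1_score` is just an illustrative helper, not something taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, converted from percentages to fractions.
reported_precision = 0.9432
reported_recall = 0.9412

f1 = f1_score(reported_precision, reported_recall)
print(f"{f1:.4f}")  # 0.9422
```

The result matches the reported F1 of 0.9422 to four decimal places.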
This article is indexed in CNKI and other databases.