Study on Text Classification Method of BERT-TECNN Model
Cite this article: LI Tiefei, SHENG Long, WU Di. Study on Text Classification Method of BERT-TECNN Model[J]. Computer Engineering and Applications, 2021, 57(18): 186-193.
Authors: LI Tiefei, SHENG Long, WU Di
Affiliations: 1. College of Information and Electrical Engineering, Hebei University of Engineering, Handan, Hebei 056107, China; 2. Hebei Key Laboratory of Security & Protection Information Sensing and Processing, Hebei University of Engineering, Handan, Hebei 056107, China
Abstract: Because the Bert-base, Chinese pre-trained model has a very large number of parameters, its internal parameters change only slightly when it is fine-tuned for a classification task, so it is prone to overfitting and generalizes poorly; moreover, the model is pre-trained at the character level and therefore carries little word-level information. To address these problems, this paper proposes the BERT-TECNN model. The model uses Bert-base, Chinese as a dynamic character-vector model that outputs character vectors carrying deep feature information; a Transformer encoder layer then applies multi-head self-attention to the data once more to extract further feature information and improve the model's generalization ability; a CNN layer uses convolution kernels of different sizes to capture words of different lengths within each sample; finally, softmax performs the classification. The model is compared with deep-learning text classification models such as Word2Vec+CNN, Word2Vec+BiLSTM, Elmo+CNN, BERT+CNN, BERT+BiLSTM, and BERT+Transformer on three datasets, and it achieves the highest accuracy, precision, recall, and F1 score in all cases. The experiments show that the model effectively extracts character- and word-level feature information from text, mitigates overfitting, and improves generalization ability.
Keywords: BERT  Transformer encoder  CNN  text classification  fine-tuning  self-attention  overfitting
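
The abstract specifies the processing pipeline (BERT character vectors, an additional Transformer encoder pass of multi-head self-attention, a multi-kernel CNN, then softmax) but not its hyperparameters. The following PyTorch sketch shows one plausible wiring of that stack; the kernel sizes (2, 3, 4), filter count, and head count are illustrative assumptions, not values taken from the paper.

# A minimal PyTorch sketch of the BERT-TECNN pipeline described in the
# abstract. Kernel sizes, filter count, and head count are assumptions;
# the paper's actual hyperparameters are not reproduced here.
import torch
import torch.nn as nn
from transformers import BertModel

class BertTECNN(nn.Module):
    def __init__(self, num_classes, kernel_sizes=(2, 3, 4), num_filters=128):
        super().__init__()
        # Dynamic character-vector model: bert-base-chinese (768-dim outputs)
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        hidden = self.bert.config.hidden_size
        # Transformer encoder layer: a second round of multi-head
        # self-attention over the BERT character vectors
        self.encoder = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=8, batch_first=True)
        # CNN layer: kernels of different widths capture "words" of
        # different lengths in the character sequence
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden, num_filters, k) for k in kernel_sizes])
        self.classifier = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, hidden) contextual character vectors
        x = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        # Padding positions (attention_mask == 0) are excluded from attention
        x = self.encoder(x, src_key_padding_mask=~attention_mask.bool())
        x = x.transpose(1, 2)  # (batch, hidden, seq_len) for Conv1d
        # Max-pool each feature map over time, then concatenate
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        # Softmax over classes, matching the final classification step
        return self.classifier(torch.cat(pooled, dim=1)).softmax(dim=-1)

In use, inputs would come from the matching BertTokenizer ("bert-base-chinese"), and training would typically apply a negative log-likelihood loss to the softmax outputs, or equivalently cross-entropy to the pre-softmax logits.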