Study on Text Classification Method of BERT-TECNN Model
Cite this article: LI Tiefei, SHENG Long, WU Di. Study on Text Classification Method of BERT-TECNN Model[J]. Computer Engineering and Applications, 2021, 57(18): 186-193.
Authors: LI Tiefei, SHENG Long, WU Di
Affiliations: 1. College of Information and Electrical Engineering, Hebei University of Engineering, Handan, Hebei 056107, China; 2. Hebei Key Laboratory of Security & Protection Information Sensing and Processing, Hebei University of Engineering, Handan, Hebei 056107, China
Abstract: Because the Bert-base, Chinese pre-trained model has a very large number of parameters, its internal parameters change only slightly when it is fine-tuned for a classification task, so it is prone to overfitting and generalizes poorly; moreover, the model is pre-trained at the character level and therefore carries little word-level information. To address these problems, this paper proposes the BERT-TECNN model. The model uses Bert-base, Chinese as a dynamic character-vector model that outputs character vectors carrying deep feature information; a Transformer encoder layer then applies multi-head self-attention to the data once more to extract further feature information and improve the model's generalization ability; a CNN layer uses convolution kernels of different sizes to capture words of different lengths within each sample; finally, softmax performs the classification. The model is compared with deep-learning text classification models such as Word2Vec+CNN, Word2Vec+BiLSTM, Elmo+CNN, BERT+CNN, BERT+BiLSTM, and BERT+Transformer on three datasets, and it achieves the highest accuracy, precision, recall, and F1 score in all cases. The experiments show that the model effectively extracts character- and word-level feature information from text, mitigates overfitting, and improves generalization ability.
Keywords: BERT  Transformer encoder  CNN  text classification  fine-tuning  self-attention  overfitting
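
The abstract specifies the processing pipeline (BERT character vectors, an additional Transformer encoder pass of multi-head self-attention, a multi-kernel CNN, then softmax) but not its hyperparameters. The following PyTorch sketch shows one plausible wiring of that stack; the kernel sizes (2, 3, 4), filter count, and head count are illustrative assumptions, not values taken from the paper.

# A minimal PyTorch sketch of the BERT-TECNN pipeline described in the
# abstract. Kernel sizes, filter count, and head count are assumptions;
# the paper's actual hyperparameters are not reproduced here.
import torch
import torch.nn as nn
from transformers import BertModel

class BertTECNN(nn.Module):
    def __init__(self, num_classes, kernel_sizes=(2, 3, 4), num_filters=128):
        super().__init__()
        # Dynamic character-vector model: bert-base-chinese (768-dim outputs)
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        hidden = self.bert.config.hidden_size
        # Transformer encoder layer: a second round of multi-head
        # self-attention over the BERT character vectors
        self.encoder = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=8, batch_first=True)
        # CNN layer: kernels of different widths capture "words" of
        # different lengths in the character sequence
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden, num_filters, k) for k in kernel_sizes])
        self.classifier = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, hidden) contextual character vectors
        x = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        # Padding positions (attention_mask == 0) are excluded from attention
        x = self.encoder(x, src_key_padding_mask=~attention_mask.bool())
        x = x.transpose(1, 2)  # (batch, hidden, seq_len) for Conv1d
        # Max-pool each feature map over time, then concatenate
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        # Softmax over classes, matching the final classification step
        return self.classifier(torch.cat(pooled, dim=1)).softmax(dim=-1)

In use, inputs would come from the matching BertTokenizer ("bert-base-chinese"), and training would typically apply a negative log-likelihood loss to the softmax outputs, or equivalently cross-entropy to the pre-softmax logits.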