Survey of Natural Language Processing Pre-training Techniques
Cite this article: LI Zhou-jun, FAN Yu, WU Xian-jie. Survey of Natural Language Processing Pre-training Techniques[J]. Computer Science, 2020, 47(3): 162-173.
Authors: LI Zhou-jun  FAN Yu  WU Xian-jie
Affiliation: School of Computer Science and Engineering, Beihang University, Beijing 100191, China
Funding: Beijing Advanced Innovation Center for Imaging Theory and Technology; State Key Laboratory of Software Development Environment; National Natural Science Foundation of China
Abstract: In recent years, with the rapid development of deep learning, pre-training techniques for natural language processing have made great progress. Early natural language processing long relied on word embedding methods such as Word2Vec to encode text; these methods can also be regarded as static pre-training techniques. However, such context-independent text representations bring only limited improvement to downstream natural language processing tasks and cannot resolve polysemy. ELMo introduced a context-dependent text representation method that handles polysemous words effectively. Subsequently, pre-trained language models such as GPT and BERT were proposed; BERT in particular achieved significant improvements on many typical downstream tasks, greatly advancing the field of natural language processing and ushering in the era of dynamic pre-training techniques. Since then, a large number of pre-trained language models, including BERT-based improved models and XLNet, have emerged, and pre-training has become an indispensable mainstream technique in natural language processing. This paper first outlines pre-training techniques and their development history, and describes in detail the classic pre-training techniques in natural language processing, including the early static techniques and the classic dynamic techniques. It then briefly reviews a series of new and inspiring pre-training techniques, including BERT-based improved models and XLNet. On this basis, it analyzes the problems currently facing pre-training research, and finally discusses the future development trends of pre-training techniques.

Keywords: Natural language processing  Pre-training  Word embedding  Language model
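The abstract's central contrast, that static embeddings assign one fixed vector per word type and therefore cannot separate the senses of a polysemous word, while ELMo/BERT-style encoders produce context-dependent representations, can be sketched with a toy example. All vectors and the tiny "contextual" mixing rule below are invented for illustration only; they are not the actual models discussed in the survey.

```python
STATIC = {  # one fixed vector per word type, as in Word2Vec-style embeddings
    "bank":  [0.5, 0.5],
    "river": [0.0, 1.0],
    "money": [1.0, 0.0],
}

def static_embed(word, context):
    # A static embedding ignores the surrounding context entirely.
    return STATIC[word]

def contextual_embed(word, context):
    # A crude stand-in for a context-dependent encoder: shift the word's
    # vector halfway toward the average of its in-vocabulary context words.
    ctx = [STATIC[w] for w in context if w in STATIC and w != word]
    if not ctx:
        return STATIC[word]
    avg = [sum(dim) / len(ctx) for dim in zip(*ctx)]
    return [0.5 * a + 0.5 * b for a, b in zip(STATIC[word], avg)]

s1 = ["deposit", "money", "in", "the", "bank"]   # financial sense
s2 = ["the", "river", "bank", "was", "muddy"]    # geographic sense

# Same word type -> identical static vector, regardless of sense:
assert static_embed("bank", s1) == static_embed("bank", s2)
# The context-dependent representation differs between the two senses:
assert contextual_embed("bank", s1) != contextual_embed("bank", s2)
```

Real dynamic pre-trained models compute this context sensitivity with deep neural encoders trained on large corpora, but the failure mode of the static lookup is exactly the polysemy problem the abstract describes.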

This article is indexed in databases including VIP and Wanfang Data.