Research on SAO Short Text Classification in LIS Based on Semantic Association and BERT
Cite this article: Zhang Yujie, Bai Rujiang, Liu Mingyue, Yu Chunliang. Research on SAO Short Text Classification in LIS Based on Semantic Association and BERT[J]. Library and Information Service, 2021, 65(16): 118-129.
Authors: Zhang Yujie  Bai Rujiang  Liu Mingyue  Yu Chunliang
Affiliation: 1. Institute of Information Management, Shandong University of Technology, Zibo 255049; 2. Yantai University Library, Yantai 264005
Funding: Supported by the Shandong Province Colleges and Universities Youth Innovation Science and Technology Support Program project "Science and Technology Big Data Driven Intelligent Decision Support Innovation Team: Identifying Emerging Research Fronts for the Conversion of Old and New Growth Drivers" (No. 2019RWG033) and the Shandong Province Social Science Planning project "Research on the Content Annotation Model of Scientific Papers in the Digital Environment" (No. 20CSDJ65).
Abstract: [Purpose/significance] To address the shortage of semantic features and the lack of domain knowledge in classifying SAO-structured short texts, this paper proposes an SAO classification method that combines semantic association with BERT, with the aim of improving short text classification performance. [Method/process] Taking SAO short texts in the library and information science (LIS) field as the data source, a semantic association scheme consisting of three steps, "expansion-reconstruction-noise reduction", was first designed: semantic expansion and SAO reconstruction extend the semantic information of the SAO structures, while semantic noise reduction removes the noise introduced by the expansion. The BERT model was then trained on the semantically associated SAO short texts, and automatic classification was finally carried out in the classification stage. [Result/conclusion] After comparing different association values, learning rates and classifiers, the experimental results show that classification performance is best when the association value is 10 and the learning rate is 4e-5, with an average F1 value of 0.852 2; compared with SVM, LSTM and plain BERT, the F1 value is higher by 0.103 1, 0.153 8 and 0.140 5 respectively.

Keywords: SAO  short text classification  semantic association  BERT
Received: 2021-01-27
Revised: 2021-05-12
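The paper itself does not include code; the sketch below only illustrates what the "expansion" step of the semantic association scheme might look like, assuming pretrained word vectors loaded with gensim and reading the association value (10) as the number of nearest-neighbour terms appended to each SAO component. The model path, the example SAO triple, and the function name expand_sao are placeholders; the reconstruction and noise-reduction steps are only indicated as comments.

```python
# Hypothetical sketch of the "expansion" step of the semantic association scheme.
# Assumes a pretrained word-embedding file; the association value is interpreted
# here as the number of nearest-neighbour terms added per SAO component.
from gensim.models import KeyedVectors

ASSOCIATION_VALUE = 10  # best-performing value reported in the paper

# Load any pretrained word vectors (the path is a placeholder).
wv = KeyedVectors.load_word2vec_format("domain_vectors.bin", binary=True)

def expand_sao(subject: str, action: str, obj: str) -> str:
    """Append top-k related terms to each SAO component (expansion step only)."""
    parts = []
    for token in (subject, action, obj):
        related = []
        if token in wv:
            related = [w for w, _ in wv.most_similar(token, topn=ASSOCIATION_VALUE)]
        parts.append(" ".join([token] + related))
    # The reconstruction and noise-reduction steps of the scheme would follow here.
    return " ".join(parts)

print(expand_sao("knowledge graph", "support", "text classification"))
```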

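For the classification stage, the abstract only fixes the best learning rate at 4e-5. The following is a minimal BERT fine-tuning sketch using the Hugging Face transformers library; the checkpoint name (bert-base-chinese), the number of classes, the epoch count, and the toy training data are assumptions for illustration, not details taken from the paper.

```python
# Minimal BERT fine-tuning sketch for short-text classification.
# Only the learning rate (4e-5) comes from the paper; everything else is a placeholder.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertTokenizer, BertForSequenceClassification

texts = ["expanded SAO short text ..."]   # placeholder training texts
labels = [0]                              # placeholder class labels
NUM_CLASSES = 2                           # assumed number of categories

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=NUM_CLASSES)

enc = tokenizer(texts, truncation=True, padding=True, max_length=64,
                return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"],
                        torch.tensor(labels))
loader = DataLoader(dataset, batch_size=16, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=4e-5)  # lr from the paper
model.train()
for epoch in range(3):                    # epoch count is an assumption
    for input_ids, attention_mask, y in loader:
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```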