
Vietnamese Combinational Ambiguity Disambiguation Based on Weighted Voting Method of Multiple Classifiers
Cite this article: Li Jia, Guo Jianyi, Liu Yanchao, Yu Zhengtao, Xian Yantuan, Nguyen Thi Thanh Nga. Vietnamese Combinational Ambiguity Disambiguation Based on Weighted Voting Method of Multiple Classifiers[J]. Computer Science (计算机科学), 2018, 45(1): 167-172.
Authors: Li Jia (李佳), Guo Jianyi (郭剑毅), Liu Yanchao (刘艳超), Yu Zhengtao (余正涛), Xian Yantuan (线岩团), Nguyen Thi Thanh Nga (阮氏青娥)
Affiliations:
Li Jia, Liu Yanchao: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Guo Jianyi, Yu Zhengtao, Xian Yantuan: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; The Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technology, Kunming 650500, China
Nguyen Thi Thanh Nga: School of International Education, Kunming University of Science and Technology, Kunming 650093, China
Funding: Supported by the National Natural Science Foundation of China (61262041, 61562052, 61472168) and the Key Project of the Natural Science Foundation of Yunnan Province (2013FA030).
Abstract: Resolving combinational ambiguity is one of the key problems in word segmentation and directly affects segmentation accuracy. To address the effect of combinational ambiguity on Vietnamese word segmentation, and drawing on the characteristics of Vietnamese combinational words, this paper proposes an ensemble-learning method for Vietnamese combinational ambiguity disambiguation. First, Vietnamese combinational ambiguous words are selected manually to build a library of Vietnamese combinational ambiguity fields, and a Vietnamese corpus is matched against a Vietnamese combinational-word dictionary to extract the combinational ambiguity fields. Second, three classifiers are trained with Vietnamese word-frequency features and context information, giving three disambiguation models and their respective results. Finally, a weight is computed for each classifier, and the final classification of each combinational ambiguity is made using a threshold. Experiments show that the proposed method reaches an accuracy of 83.32%, which is 5.81% higher than that of the best-performing single classifier.

Keywords: Combinational-word dictionary; Combinational ambiguity disambiguation; Vietnamese; Ensemble learning; Weighted voting method
Received: 2017/5/8
Revised: 2017/9/27
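
The final step of the abstract (weighting each classifier, then classifying with a threshold) can be illustrated with a minimal sketch. The abstract does not give the weight formula, the three classifier types, or the threshold value, so everything specific here is an assumption: weights are taken to be held-out accuracies normalised to sum to 1, labels are binary (1 = keep the field as one combined word, 0 = split it), and the threshold is 0.5. The function names `classifier_weights` and `weighted_vote` and the accuracy figures are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def classifier_weights(accuracies: Sequence[float]) -> List[float]:
    """Normalise per-classifier accuracies so the weights sum to 1.

    Assumed weighting scheme; the paper's actual formula is not given
    in the abstract.
    """
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one accuracy must be positive")
    return [a / total for a in accuracies]


def weighted_vote(votes: Sequence[int],
                  weights: Sequence[float],
                  threshold: float = 0.5) -> int:
    """Combine binary votes (1 = combined word, 0 = split) by weight.

    Returns 1 when the total weight of classifiers voting 1 reaches the
    threshold, otherwise 0. The 0.5 default is an assumption.
    """
    if len(votes) != len(weights):
        raise ValueError("votes and weights must have the same length")
    score = sum(w for v, w in zip(votes, weights) if v == 1)
    return 1 if score >= threshold else 0


# Illustrative accuracies for three hypothetical classifiers.
weights = classifier_weights([0.70, 0.75, 0.775])

# Classifiers 1 and 3 vote "combined word", classifier 2 votes "split".
decision = weighted_vote([1, 0, 1], weights)
print(decision)
```

With these illustrative weights, two agreeing classifiers outvote the third, while a single classifier voting 1 on its own stays below the threshold. A different weight formula or threshold would change where that boundary falls.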