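Research on Recognition of Chinese Transliterations of Tibetan Personal Names
Cite this article: Luo Zhiyong, Song Rou, Zhu Xiaojie. Research on Recognition of Chinese Transliterations of Tibetan Personal Names [J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2009, 28(3).
Authors: Luo Zhiyong, Song Rou, Zhu Xiaojie
Affiliation: Center for Language Information Processing, Beijing Language and Culture University, Beijing 100083
Funding: National Natural Science Foundation of China; Key Project of Science and Technology Research of the Ministry of Education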
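Abstract: Recognizing Chinese transliterations of Tibetan personal names falls within person-name recognition, but existing person-name recognition methods do not fully fit the naming characteristics of Tibetan names. First, Tibetan names carry strong religious and cultural connotations, and their character (string) features and internal structure are complex. Second, Tibetan names contain many high-frequency single characters, which makes ambiguity conflicts between Tibetan names and common words very prominent and blurs the boundary between a Tibetan name and its context. Based on a survey of a large set of Tibetan name instances and corpora, this paper statistically analyzes the character (string) usage of Tibetan names and builds a database of Tibetan name attribute features. Using the naming rules and attribute features of Tibetan names, it represents Tibetan names formally and implements a system that automatically recognizes Chinese transliterations of Tibetan names. In an open test on a real corpus, the F-measure reaches 87.12%.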

Keywords: Tibetan name recognition; out-of-vocabulary words; reliability; automatic word segmentation

Research on Recognition of Tibetan Names
Luo Zhiyong, Song Rou, Zhu Xiaojie. Research on Recognition of Tibetan Names [J]. Journal of the China Society for Scientific and Technical Information, 2009, 28(3).
Authors: Luo Zhiyong, Song Rou, Zhu Xiaojie
Affiliation: Center for Language Information Processing, Beijing Language and Culture University, Beijing 100083
Abstract: Although recognition of Tibetan names is a kind of person-name recognition, current person-name recognition methods do not fit the characteristics of Tibetan names. First, Tibetan names have strong religious and cultural connotations, which results in complicated character (string) features and internal structure. Second, Tibetan names contain many high-frequency single-character words, which makes ambiguity conflicts between names and common words more prominent and blurs the border around the Tibetan...
Keywords: recognition of Tibetan names; out-of-vocabulary words; reliability; segmentation
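
The abstract reports its open-test result as an F value. As a minimal sketch of how such a score is usually computed, the snippet below derives precision, recall, and the F-measure from raw counts of recognized names. It assumes the balanced F1 form (beta = 1), which the abstract does not state explicitly; the function name `f_measure` and the counts in the usage line are hypothetical and are not taken from the paper.

```python
def f_measure(true_positives: int, false_positives: int,
              false_negatives: int, beta: float = 1.0) -> float:
    """Return the F-beta score computed from raw recognition counts.

    true_positives:  names the system recognized correctly
    false_positives: strings the system wrongly marked as names
    false_negatives: names in the corpus that the system missed
    beta = 1.0 gives the balanced F1 (an assumption, see lead-in).
    """
    if true_positives == 0:
        # No correct output: precision and recall are both zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, for illustration only (not the paper's data).
score = f_measure(true_positives=80, false_positives=10, false_negatives=20)
print(f"F1 = {score:.4f}")
```

With beta = 1 the score is the harmonic mean of precision and recall, so a system cannot reach a high F value by over-generating candidates or by recognizing only the easiest names.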
This article is indexed in CNKI, Wanfang Data, and other databases.