A Semi-automatic Method Based on Statistic for Mandarin Semantic Structures Extraction in Specific Domains
Authors: XIONG Ying, ZHU Jie, SUN Jing
Affiliation: Dept. of Electronic Eng., Shanghai Jiaotong Univ., Shanghai 200030, China
Funding: Foundation Research Program, Science & Technology Committee of Shanghai Municipality (No. 01JC14033)
Abstract: This paper proposes a new method for the semi-automatic extraction of semantic structures from unlabelled corpora in specific domains. The approach is statistical in nature, and the extracted structures can be used for shallow parsing and semantic labeling. By iteratively extracting new words and clustering words, we obtain an initial semantic lexicon that groups words with the same semantic meaning together as a class. A bootstrapping algorithm is then adopted to extract semantic structures, and these structures are in turn used to extract new key words and augment the semantic lexicon. The resulting semantic structures can be interpreted by humans and are amenable to hand-editing for refinement. In the experiment, the semi-automatically extracted structures, SSA, provide a recall rate of 84.5%.

Keywords: semantic structure, language model, semi-automatic extraction, semantic grouping, NLU
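
The loop outlined in the abstract (label words with their semantic class, collect recurring structures, then use those structures to find new class members) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give the statistics or thresholds used, and the clustering step that builds the initial lexicon is not shown. The function names, the `min_count` frequency threshold, the single-slot matching rule, and the toy English tokens (standing in for segmented Mandarin words) are all assumptions made for illustration.

```python
from collections import Counter


def label(sentence, lexicon):
    """Replace every word that has a semantic class with its class tag."""
    return tuple(lexicon.get(word, word) for word in sentence)


def extract_structures(corpus, lexicon, min_count=2):
    """Keep labelled patterns that recur and contain at least one class tag.

    min_count is a hypothetical frequency threshold, not a value from the paper.
    """
    tags = set(lexicon.values())
    counts = Counter(label(sentence, lexicon) for sentence in corpus)
    return {
        pattern
        for pattern, count in counts.items()
        if count >= min_count and any(token in tags for token in pattern)
    }


def augment_lexicon(corpus, lexicon, structures):
    """Add a word to a class when it is the only mismatch against a
    structure and sits in that structure's class slot (assumed rule)."""
    tags = set(lexicon.values())
    augmented = dict(lexicon)
    for sentence in corpus:
        labelled = label(sentence, lexicon)
        for pattern in structures:
            if len(pattern) != len(sentence):
                continue
            diff = [i for i, (a, b) in enumerate(zip(labelled, pattern)) if a != b]
            if len(diff) == 1 and pattern[diff[0]] in tags:
                # setdefault keeps any class the word already has
                augmented.setdefault(sentence[diff[0]], pattern[diff[0]])
    return augmented


def bootstrap(corpus, lexicon, max_rounds=5):
    """Alternate structure extraction and lexicon growth until nothing changes."""
    structures = set()
    for _ in range(max_rounds):
        structures = extract_structures(corpus, lexicon)
        augmented = augment_lexicon(corpus, lexicon, structures)
        if augmented == lexicon:
            break
        lexicon = augmented
    return lexicon, structures


# Toy data: a seed lexicon with two known city names and one unknown word.
corpus = [
    ["fly", "to", "beijing"],
    ["fly", "to", "shanghai"],
    ["fly", "to", "guangzhou"],
]
seed_lexicon = {"beijing": "<CITY>", "shanghai": "<CITY>"}

final_lexicon, final_structures = bootstrap(corpus, seed_lexicon)
print(final_lexicon)
print(final_structures)
```

In this toy run the pattern `fly to <CITY>` recurs, so the unknown word in the third sentence is proposed as a new member of the `<CITY>` class. In a semi-automatic setting such as the one the abstract describes, proposals like this would be reviewed and hand-edited rather than accepted blindly.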

XIONG Ying, ZHU Jie, SUN Jing. A Semi-automatic Method Based on Statistic for Mandarin Semantic Structures Extraction in Specific Domains[J]. Journal of Shanghai Jiaotong University, 2004, 9(4): 25-29.
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.