
基于大规模语料库的句法模式匹配研究
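Cite this article: 张亮, 陈家骏. 基于大规模语料库的句法模式匹配研究[J]. 中文信息学报, 2007, 21(5): 31-35.
Authors: 张亮 (ZHANG Liang), 陈家骏 (CHEN Jia-jun)
Affiliations: 1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu 210093;
2. Department of Public Security Science and Technology, Jiangsu Police Institute, Nanjing, Jiangsu 210000
Funding: National High Technology Research and Development Program of China (863 Program); National Social Science Fund of China; National Natural Science Foundation of China
Abstract: Syntactic analysis can draw on a large store of correctly processed examples for which both the analysis process and the result are recorded: during parsing, it searches for similar examples or fragments and matches similar linguistic structures and analysis steps. Parsing of this kind embodies the idea that "language analysis depends on experience". Building on this idea, this paper proposes a parsing method based on pattern matching: the syntactic patterns implicit in a large-scale annotated treebank are extracted to build a library of patterns, sub-patterns, and their reductions, so that parsing becomes a process of pattern matching and local pattern transformation. Experiments show that the parsing metrics are all fairly good and that processing is especially efficient, averaging 0.46 seconds per sentence (dual-core Intel CPU at 2.8 GHz, 1 GB of memory).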

Keywords: computer applications; Chinese information processing; syntactic parsing; pattern matching; syntactic treebank
Article ID: 1003-0077(2007)05-0031-05
Received: 2007-04-18
Revised: 2007-06-29

Researches on Large Scale Corpus-Based Syntactic Pattern Matching
ZHANG Liang, CHEN Jia-jun. Researches on Large Scale Corpus-Based Syntactic Pattern Matching[J]. Journal of Chinese Information Processing, 2007, 21(5): 31-35.
Authors: ZHANG Liang, CHEN Jia-jun
Affiliation: 1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu 210093, China;
2. Jiangsu Police Institute, Nanjing, Jiangsu 210000, China
Abstract: Given a large number of correctly parsed examples for which both the parsing procedure and the parsing result are recorded, syntactic parsing can be carried out by searching for similar examples or fragments and matching similar language structures and analyses in them. This embodies the assumption that human language perception and production work with representations of concrete language experiences rather than with abstract grammar rules. In this paper, we propose a parsing technique based on syntactic pattern matching. We extract syntactic patterns from a large-scale treebank and build, in advance, a library of syntactic patterns, sub-patterns, and their corresponding reduction procedures. Parsing is then carried out by pattern matching and partial pattern transformation. Experiments show that the parsing results are satisfactory and that execution is fast, averaging 0.46 s per sentence (CPU: Intel Core Duo 2.8 GHz; memory: 1 GB).
Keywords: computer application; Chinese information processing; syntactic parsing; pattern matching; tree bank
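
Illustrative sketch (not the authors' implementation): the abstract describes extracting patterns from a treebank into a library and then parsing by matching and reducing them. The Python below shows one simple way such a pipeline could look, under loud assumptions: the two-tree toy treebank, the tag set, the function names (`build_library`, `find_match`, `parse`), and the greedy longest-match, leftmost-first reduction strategy are all hypothetical choices made for this example. The paper's sub-patterns and partial pattern transformation are not modelled here.

```python
from collections import Counter

# Hypothetical toy treebank. A tree is (label, child, ...); a leaf is (pos_tag, word).
TREEBANK = [
    ("S", ("NP", ("PN", "我")),
          ("VP", ("VV", "喜欢"), ("NP", ("NN", "音乐")))),
    ("S", ("NP", ("NN", "学生")),
          ("VP", ("VV", "阅读"), ("NP", ("JJ", "新"), ("NN", "书")))),
]


def is_leaf(node):
    return len(node) == 2 and isinstance(node[1], str)


def extract_patterns(tree, counts):
    """Count one (child-label sequence, parent label) pattern per internal node."""
    if is_leaf(tree):
        return
    label, children = tree[0], tree[1:]
    counts[(tuple(child[0] for child in children), label)] += 1
    for child in children:
        extract_patterns(child, counts)


def build_library(treebank):
    """Map each child-label sequence to its most frequent parent label."""
    counts = Counter()
    for tree in treebank:
        extract_patterns(tree, counts)
    best = {}
    for (seq, parent), n in counts.items():
        if seq not in best or n > best[seq][1]:
            best[seq] = (parent, n)
    return {seq: parent for seq, (parent, _) in best.items()}


def find_match(nodes, library, longest):
    """Return (start, size, parent) for the longest, then leftmost, matching span."""
    for size in range(min(longest, len(nodes)), 0, -1):
        for start in range(len(nodes) - size + 1):
            seq = tuple(node[0] for node in nodes[start:start + size])
            if seq in library:
                return start, size, library[seq]
    return None


def parse(tagged, library, max_steps=1000):
    """Greedily reduce matched spans until no pattern in the library applies.

    `tagged` is a list of (word, pos_tag) pairs. `max_steps` guards against
    unary reduction cycles that a real treebank could contain.
    """
    nodes = [(tag, word) for word, tag in tagged]
    longest = max(len(seq) for seq in library)
    for _ in range(max_steps):
        match = find_match(nodes, library, longest)
        if match is None:
            break
        start, size, parent = match
        nodes[start:start + size] = [(parent, *nodes[start:start + size])]
    return nodes


if __name__ == "__main__":
    library = build_library(TREEBANK)
    sentence = [("他", "PN"), ("阅读", "VV"), ("新", "JJ"), ("书", "NN")]
    print(parse(sentence, library))
```

Because matching is a dictionary lookup over short label sequences, each reduction step is cheap, which is consistent with the abstract's emphasis on speed; a greedy strategy like this one can, however, commit to a wrong reduction that a full search would avoid.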
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.