Similar Literature
1.
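A Chinese Syntactic Parser Based on Syntactic and Semantic Features   Total citations: 4, self-citations: 2, citations by others: 2
Syntactic parsing is not simple symbolic reasoning; it should be a form of entity reasoning. Adding semantic information is an effective way to achieve entity reasoning in syntactic parsing. The parser described in this paper has two distinguishing features. First, word-based rules for handling words that belong to several parts of speech greatly improve parsing efficiency. Second, the static and dynamic syntactic-semantic features of words are used to restrain the excessive generative power of the syntactic rules, with good results.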

2.
The paper is the third in a series of three papers devoted to a detailed study of LR(k) parsing with error recovery and correction. A new class of syntax errors is introduced, called (k)-local parser defined errors, which are better suited than conventional minimum distance errors to characterizing error detection and recovery in LR(k) parsing. The question of whether a given string has n k-local parser defined errors for some integer n is shown to be decidable. Using the formalization of LR(k) parsing and error recovery presented in the first and second papers in the series, it is shown that the canonical LR(k) parser of an LR(k) grammar always has an error recovering extension which is able to produce a correction for any terminal string containing only (k)-local parser defined errors.

3.
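谢德峰 (Xie Defeng)  吉建民 (Ji Jianmin) 《计算机应用》 (Journal of Computer Applications) 2021,41(9):2489-2495
In natural language processing (NLP), syntactic information describes the syntactic structure or dependency relations between the words of a complete sentence, and it is an important and effective source of reference information. The semantic parsing task converts natural-language sentences directly into a semantically complete, machine-executable language. Little previous work on semantic parsing has used syntactic information from the input to improve end-to-end semantic parsing. To further improve the accuracy and efficiency of end-to-end semantic parsing models, a semantic parsing method is proposed that exploits syntactic dependency information from the input side. The basic idea is first to pre-train an end-to-end dependency parser; then to take that parser's intermediate representation as a syntax-aware representation and concatenate it with the original character and word embeddings to produce a new input embedding, which is fed to the end-to-end semantic parsing model; and finally to fuse models through transductive fusion learning. Experiments compare the proposed model with the Transformer baseline and with related work from the past decade. The results show that on the ATIS, GEO, and JOBS datasets, the semantic parsing model that incorporates the dependency-syntax-aware representation and transductive fusion learning achieves the best accuracies of 89.1%, 90.7%, and 91.4% respectively, surpassing the Transformer on all three and confirming the effectiveness of introducing syntactic dependency information.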

4.
Koen De Bosschere 《Software》1996,26(7):763-779
Prolog is a language with a dynamic grammar which is the result of embedded operator declarations. The parsing of such a language cannot be done easily by means of standard tools. Most often, an existing parsing technique for a static grammar is adapted to deal with the dynamic constructs. This paper uses the syntax definition as defined by the ISO standard for the Prolog language. It starts with a brief discussion of the standard, highlighting some aspects that are important for the parser, such as the restrictions on the use of operators as imposed by the standard in order to make the parsing deterministic. Some possible problem areas are also indicated. As output is closely related to input in Prolog, both are treated in this paper. Some parsing techniques are compared and an operator precedence parser is chosen to be modified to deal with the dynamic operator declarations. The necessary modifications are discussed and an implementation in C is presented. Performance data are collected and compared with a public domain Prolog parser written in Prolog. It is the first efficient public domain parser for Standard Prolog that actually works and deals with all the details of the syntax.
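
The idea of an operator precedence parser whose operator table can change at run time can be illustrated with a small sketch. This is not the C implementation from the paper, and it is far simpler than ISO Prolog syntax: it handles infix operators only, and it assumes that a higher number binds tighter, which is the reverse of Prolog's priority convention. The names `OperatorTable`, `declare`, and `parse` are hypothetical.

```python
# Hypothetical sketch: precedence climbing with a mutable operator table.
# Assumptions: infix operators only; higher power = binds tighter
# (the opposite of Prolog's 1..1200 priorities).

class OperatorTable:
    def __init__(self):
        self.ops = {}  # symbol -> (binding power, associativity)

    def declare(self, symbol, power, assoc="left"):
        """Add or redefine an operator; takes effect for later parses."""
        self.ops[symbol] = (power, assoc)


def parse(tokens, table, min_power=0):
    """Parse a token list into nested (op, left, right) tuples.

    Tokens are consumed from the front of the list.
    """
    left = tokens.pop(0)  # an operand (atom)
    while tokens and tokens[0] in table.ops:
        power, assoc = table.ops[tokens[0]]
        if power < min_power:
            break
        op = tokens.pop(0)
        # A left-associative operator must not absorb an equal-power
        # operator on its right; a right-associative one may.
        next_min = power + 1 if assoc == "left" else power
        right = parse(tokens, table, next_min)
        left = (op, left, right)
    return left


table = OperatorTable()
table.declare("+", 10)
table.declare("*", 20)
tree = parse(["a", "+", "b", "*", "c"], table)

# A declaration made later changes how subsequent input is parsed.
table.declare("===", 5, "right")
chain = parse(["a", "===", "b", "===", "c"], table)
```

Because the table is consulted on every token, a new declaration needs no regeneration of the parser, which is the property that makes this family of techniques attractive for a grammar that changes during parsing.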

5.
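A GLR-Based Shallow Syntactic Parser for English-Chinese Machine Translation   Total citations: 5, self-citations: 0, citations by others: 5
Shallow parsing refers to phrase-level syntactic analysis of natural language. During development of the MatLink English-Chinese machine translation system, an extended CFG grammar was proposed to describe English phrase syntax, the GLR algorithm was improved, and an English shallow parser for English-Chinese translation was designed and implemented. The parser uses a multi-exit parse table structure, introduces a symbol mapping function to identify phrase boundaries automatically, describes the syntactic structure of phrases with child-sibling trees, and performs phrase-level transfer from the source language to the target language through phrase transfer patterns. Finally, the design and operation of the shallow parser are illustrated by analyzing an example sentence.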

6.
Visual YACC is a tool that automatically creates visualizations of the YACC LR parsing process and synthesized attribute computation. The Visual YACC tool works by instrumenting a standard YACC grammar with graphics calls that draw the appropriate data structures given the current actions by the parser. The new grammar is processed by the YACC tools and the resulting parser displays the parse stack and parse tree for every step of the parsing process of a given input string. Visual YACC was initially designed to be used in compiler construction courses to supplement the teaching of parsing and syntax directed evaluation. We have also found it to be useful in the difficult task of debugging YACC grammars. In this paper, we describe this tool and how it is used in both contexts. We also detail two different implementations of this tool: one that produces a parser written in C with calls to Motif; and a second implementation that generates Java source code. Copyright © 1999 John Wiley & Sons, Ltd.

7.
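A General Method for Realistic Geometric Modeling of Plants   Total citations: 4, self-citations: 0, citations by others: 4
When plant geometry is modeled with L-systems, the concrete modeling process changes whenever the rule definitions change. To address this, a fairly general solution based on a parser for an L-system rule language is proposed. Through induction and abstraction, a language called L-plants is obtained that can define many kinds of L-system rules, and a parser is constructed for it. The parser recognizes the initial state and rules of the L-system and performs rule substitution to form the final string; a shape grammar then interprets the string to build the geometric model of the plant. Experiments show that the method can substantially improve the efficiency of plant geometric modeling.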
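
The two stages named in the abstract, rule substitution followed by interpretation of the resulting string, can be sketched as follows. The syntax of the L-plants language is not given in the abstract, so this sketch assumes rules are already available as a plain dictionary and uses the common turtle-style symbols `F`, `+`, `-`, `[`, `]`; the function names `rewrite` and `interpret` and the default angle are hypothetical.

```python
import math


def rewrite(axiom, rules, iterations):
    """Apply all rules in parallel to every symbol, `iterations` times."""
    s = axiom
    for _ in range(iterations):
        # Symbols without a rule are copied unchanged.
        s = "".join(rules.get(ch, ch) for ch in s)
    return s


def interpret(commands, angle_deg=25.0, step=1.0):
    """Turn a symbol string into 2D line segments (a stand-in for the
    geometric interpretation stage; assumed turtle semantics)."""
    x, y, heading = 0.0, 0.0, 90.0  # start at the origin, pointing up
    stack, segments = [], []
    for ch in commands:
        if ch == "F":  # draw one step forward
            nx = x + step * math.cos(math.radians(heading))
            ny = y + step * math.sin(math.radians(heading))
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif ch == "+":
            heading += angle_deg
        elif ch == "-":
            heading -= angle_deg
        elif ch == "[":  # start a branch: remember the current state
            stack.append((x, y, heading))
        elif ch == "]":  # end a branch: return to the remembered state
            x, y, heading = stack.pop()
    return segments


algae = rewrite("A", {"A": "AB", "B": "A"}, 3)
branch = rewrite("F", {"F": "F[+F]F"}, 1)
segments = interpret(branch)
```

Keeping the rewriting step independent of the symbols' meaning is what lets one parser serve many rule sets: only the rule dictionary changes from plant to plant.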

8.
Phil Cook  Jim Welsh 《Software》2001,31(15):1461-1486
Incremental parsing has long been recognized as a technique of great utility in the construction of language‐based editors, and correspondingly, the area currently enjoys a mature theory. Unfortunately, many practical considerations have been largely overlooked in previously published algorithms. Many user requirements for an editing system necessarily impact on the design of its incremental parser, but most approaches focus only on one: response time. This paper details an incremental parser based on LR parsing techniques and designed for use in a modeless syntax recognition editor. The nature of this editor places significant demands on the structure and quality of the document representation it uses, and hence, on the parser. The strategy presented here is novel in that both the parser and the representation it constructs are tolerant of the inevitable and frequent syntax errors that arise during editing. This is achieved by a method that differs from conventional error repair techniques, and that is more appropriate for use in an interactive context. Furthermore, the parser aims to minimize disturbance to this representation, not only to ensure other system components can operate incrementally, but also to avoid unfortunate consequences for certain user‐oriented services. The algorithm is augmented with a limited form of predictive tree‐building, and a technique is presented for the determination of valid symbols for menu‐based insertion. Copyright © 2001 John Wiley & Sons, Ltd.

9.
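To further improve the accuracy of Kazakh syntactic parsing and lay a solid foundation for Kazakh natural language processing, transition-based syntactic parsing of Kazakh is studied. An improved transition-based method is applied to the syntax trees: each tree is converted to an action sequence by in-order traversal. A neural network is used to build the parser framework, in which three long short-term memory (LSTM) networks represent the stack, the buffer, and the action history respectively. The model is trained on these representations, and the action sequence is predicted from the resulting probabilities to yield the parse. The improved transition method achieves a parsing accuracy of 74.37%.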
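
Converting a syntax tree to an action sequence by in-order traversal can be sketched as below. The abstract does not list the paper's action set, so this sketch assumes the three actions commonly used in in-order transition systems: `SHIFT`, `PJ-X` (project a nonterminal `X` over the item on top of the stack), and `REDUCE`. Trees are nested tuples `(label, child, ...)` with words as plain strings; all names are hypothetical, and the English example merely stands in for Kazakh data.

```python
def tree_to_actions(tree):
    """In-order oracle: first child, then the node label, then the rest."""
    if isinstance(tree, str):  # a word
        return ["SHIFT"]
    label, children = tree[0], tree[1:]
    actions = tree_to_actions(children[0])
    actions.append(f"PJ-{label}")
    for child in children[1:]:
        actions.extend(tree_to_actions(child))
    actions.append("REDUCE")
    return actions


class _Projected:
    """Stack marker for a nonterminal that is still open."""
    def __init__(self, label):
        self.label = label


def actions_to_tree(actions, words):
    """Replay an action sequence over a word buffer to rebuild the tree."""
    buffer, stack = list(words), []
    for act in actions:
        if act == "SHIFT":
            stack.append(buffer.pop(0))
        elif act.startswith("PJ-"):
            stack.append(_Projected(act[3:]))
        else:  # REDUCE: close the most recently projected nonterminal
            rest = []
            while not isinstance(stack[-1], _Projected):
                rest.append(stack.pop())
            label = stack.pop().label
            first = stack.pop()  # the child that preceded the projection
            stack.append((label, first, *reversed(rest)))
    return stack[0]


gold = ("S", ("NP", "the", "cat"), ("VP", "sleeps"))
actions = tree_to_actions(gold)
rebuilt = actions_to_tree(actions, ["the", "cat", "sleeps"])
```

In a neural parser of the kind the abstract describes, a sequence like `actions` would serve as the training target, with the model predicting the next action from encodings of the stack, the buffer, and the action history.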

10.
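Shift-reduce parsing systems have linear time complexity and are therefore of particular practical value for large-scale parsing tasks. However, the performance of current shift-reduce parsers is far below that of the best parsers in the field, such as the Berkeley parser. This paper studies how to use uptraining and unlabeled data to improve a shift-reduce parser so that it comes as close as possible to the Berkeley parser's performance. We first apply the Berkeley parser to automatically analyze large-scale unlabeled data, then use the resulting automatically labeled data as additional training data to improve the part-of-speech tagger and the shift-reduce parser. Experiments show that uptraining with unlabeled data improves shift-reduce parsing performance by 2.3%, to 82.4%, which is comparable to the Berkeley parser. The final parser also has a clear speed advantage (7 times the speed of the Berkeley parser).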

11.
Affix grammars are an extension of context-free grammars which retain most of their advantages and eliminate most of their limitations with respect to the definition of programming languages and the specification of their translators. The extension allows definition of context-sensitive syntax features, and also allows semantics to be linked to syntax. In this paper, the parsing problem for affix grammars is explored and shown to be closely related to the parsing problem for context-free grammars. This enables a standard context-free parser constructor to be generalised to a constructor for affix grammars, essentially by addition of a preprocessor. The resulting constructors are compared with previously implemented or proposed constructors.

12.
The theory and an algorithm for optimizing a directed, labeled tree are presented. Their application to optimizing any finite pattern grammar represented in the form of a tree is discussed. Tree optimization leads to a loss of information that is essential for identifying patterns. A special technique for preserving this information is suggested. Finally, outlines of two different algorithms for parsing patterns are included. The tree parser uses the optimized tree, and the table-driven parser uses the optimized syntax stored in four separate tables.

13.
H. Mössenböck 《Software》1988,18(7):691-700
We present a simple method for connecting semantic actions to parsers. Although applicable to any kind of parser, it is especially suited for LR parsers. The method is based on the idea of separating syntax analysis and semantic processing and executing semantic actions by procedures, similar to those of a recursive descent compiler. The procedures are driven by structural information about the source program, which is collected during parsing. The method is applicable to L-attributed grammars. It can be incorporated easily into any existing parser.

14.
Logic programs resemble context-free grammars. Moreover, Prolog’s proof procedure can be viewed as a generalization of a simple top-down parser with backtracking. This simple parser has disadvantages that motivated the design of more sophisticated parsing methods. As similar disadvantages occur in Prolog’s proof procedure, it may be desirable to develop other proof procedures for logic programs than the one used by Prolog. The resemblance between definite clauses and productions suggests looking at parsing to develop such procedures. We obtain proof procedures for fixed-mode logic programs, based on “chart” parsers. Our approach concentrates on transforming (fixed-mode) logic programs rather than the parser. We first add unification to a chart parser obtaining a proof procedure for programs severely restricted in their syntax, in which the body of the clauses denotes the composition of binary relations: “chain” programs. We then show how to transform fixed-mode programs into chain form. We arrive at proof procedures that avoid some nonterminating loops as well as the recomputation of some partial results.

15.
Adaptable Parsing Expression Grammar (APEG) is a formal method for defining the syntax of programming languages. It provides an on-the-fly mechanism to perform modifications of the syntax of the language during parsing time. The primary goal of this dynamic mechanism is the formal specification and the automatic parser generation for extensible languages. In this paper, we show how APEG can be used for the definition of the extensible languages SugarJ and Fortress, clarifying many aspects of the syntax of these languages. We also show that the mechanism for on-the-fly modification of syntax rules can be useful for defining grammars in a modular way, implementing almost all types of language composition in the context of specification of extensible languages.

16.
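This paper examines the limitations of the independence assumptions of PCFGs and, to address them, proposes the concept of syntactic structure co-occurrence as a way to introduce contextual information, together with a method for computing it. To overcome the small size of Chinese treebanks, the Inside-Outside algorithm is applied iteratively to obtain the parameters of the syntactic rules. Finally, a top-down Chinese syntactic parser based on a statistical model is presented. In closed testing, its labeled precision and labeled recall are 88.1% and 86.8% respectively. The experimental results show that this approach does improve labeled precision and recall and merits further study.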
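
The independence assumption that the abstract criticizes is easiest to see in code: under a plain PCFG, the probability of a tree is just the product of the probabilities of the rules it uses, with no reference to surrounding context. The sketch below estimates rule probabilities by relative frequency from a toy treebank; it does not implement the Inside-Outside re-estimation or the co-occurrence model from the paper. Trees are nested tuples `(label, child, ...)`, and all names and the toy data are hypothetical.

```python
from collections import Counter
from math import prod


def rules_of(tree):
    """Yield every (lhs, rhs) rule used in a tree."""
    if isinstance(tree, str):  # a word contributes no rule of its own
        return
    label, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        yield from rules_of(child)


def estimate(treebank):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(r for tree in treebank for r in rules_of(tree))
    lhs_counts = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


def tree_probability(tree, probs):
    """PCFG independence assumption: multiply the rule probabilities."""
    return prod(probs[rule] for rule in rules_of(tree))


treebank = [
    ("S", ("NP", "dogs"), ("VP", "bark")),
    ("S", ("NP", "cats"), ("VP", "bark")),
]
probs = estimate(treebank)
p = tree_probability(treebank[0], probs)
```

Here `NP -> dogs` receives the same probability wherever the `NP` occurs, which is exactly the context-blindness that a co-occurrence model is meant to relax.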

17.
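Establishment and Implementation of a Syntactic Parsing Method for Modern Chinese   Total citations: 1, self-citations: 0, citations by others: 1
Using a word-segmented corpus, 70,000 in size, drawn from primary-school Chinese textbooks, this paper establishes a full syntactic parsing method that combines a hidden Markov model with hierarchical (immediate-constituent) analysis, and implements full syntactic parsing of modern Chinese. Experimental results show that the method is reasonably original and efficient; its full-parsing accuracy is 92.43% in closed testing and 65.374% in open testing.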

18.
The international patent corpus is a gigantic source that today contains about 80 million documents. Every patent is manually analyzed by patent officers and then classified by a specific code called a Patent Class (PC). Cooperative Patent Classification (CPC) is the new classification system, introduced in January 2013 to standardize the classification systems of all major patent offices. Like keywords for papers, PCs point to the core of the invention, concisely describing what a patent contains. Most patent strategies use PCs as a filter for results, so the selection of relevant PCs is often a primary and crucial activity. This task is considered particularly challenging, and only a few tools have been specially developed for this purpose. The most efficient tools are provided by the patent offices EPO and WIPO. This paper analyzes their PC search strategy (mainly based on keyword-based engines) in order to identify the main limitations in terms of missing relevant PCs (recall) and non-relevant results (precision). Patents have been processed by KOM, a semantic patent search tool developed by the authors. Unlike all other PC search tools, KOM uses a semantic parser and many knowledge bases to carry out a conceptual patent search. Its functioning is described step by step through a detailed analysis pointing out the benefits of a concept-based search vis-à-vis a keyword-based search. An exemplary case is proposed dealing with CPCs describing the sterilization of contact lenses. A comparison could likewise be conducted on other PCs such as International (IPC), European (ECLA) or United States (USPC) patent classification codes.

19.
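Traditional packet parsers handle a fixed set of protocol types and protocol layers and lack support for new network protocols, which limits the programmability of network devices. A formal parsing procedure is abstracted, and a protocol-independent programmable parser is implemented on an FPGA; supporting a new protocol requires no hardware changes, only remapping of the parse graph. Building on this mechanism, a series of optimizations is introduced that overcome the inherent seriality of packet parsing and save storage resources, providing an effective solution for high-speed programmable packet parsing. Hardware cost and performance were evaluated on an experimental platform combining general-purpose multicore processors and a high-performance FPGA. The results show that the programmable parser greatly improves packet parsing performance, parses both common and potential future network protocols quickly, and can effectively support the rapid development of custom network protocols.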
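
The notion of a parse graph that can be remapped without touching the parsing engine can be illustrated in software. This is only a sketch of the concept, not the FPGA design from the paper: each node records a header length, where to find the field that selects the next protocol, and the transitions on that field. For brevity it assumes fixed-length headers (IPv4 without options, a 20-byte TCP header); the names `PARSE_GRAPH` and `parse_packet` are hypothetical.

```python
# Hypothetical parse graph: node -> header length in bytes, the
# (offset, size) of the field that selects the next header, and the
# transitions on that field's value.
PARSE_GRAPH = {
    "ethernet": {"length": 14, "key": (12, 2), "next": {0x0800: "ipv4"}},
    "ipv4": {"length": 20, "key": (9, 1), "next": {6: "tcp", 17: "udp"}},
    "tcp": {"length": 20, "key": None, "next": {}},
    "udp": {"length": 8, "key": None, "next": {}},
}


def parse_packet(packet, graph, start="ethernet"):
    """Walk the graph over the packet bytes; return (header, offset) pairs."""
    headers, offset, node = [], 0, start
    while node is not None:
        spec = graph[node]
        if offset + spec["length"] > len(packet):
            break  # truncated packet: stop at the last complete header
        headers.append((node, offset))
        following = None
        if spec["key"] is not None:
            key_offset, key_size = spec["key"]
            field = packet[offset + key_offset:offset + key_offset + key_size]
            # Unknown values map to None, which ends the walk.
            following = spec["next"].get(int.from_bytes(field, "big"))
        offset += spec["length"]
        node = following
    return headers


# Build a minimal Ethernet / IPv4 / UDP packet for demonstration.
ip_header = bytearray(20)
ip_header[9] = 17  # IPv4 protocol field: UDP
packet = bytes(12) + b"\x08\x00" + bytes(ip_header) + bytes(8)
layers = parse_packet(packet, PARSE_GRAPH)

# Supporting a new protocol only changes the graph, not parse_packet.
extended = {name: dict(spec, next=dict(spec["next"]))
            for name, spec in PARSE_GRAPH.items()}
extended["ethernet"]["next"][0x86DD] = "ipv6"
extended["ipv6"] = {"length": 40, "key": (6, 1), "next": {6: "tcp", 17: "udp"}}
```

The loop in `parse_packet` is inherently serial, since each header's position depends on the one before it; that is the seriality the abstract says the hardware optimizations are designed to overcome.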

20.
Jean Bovet  Terence Parr 《Software》2008,38(12):1305-1332
Programmers tend to avoid using language tools, resorting to ad hoc methods, because tools can be hard to use, their parsing strategies can be difficult to understand and debug, and their generated parsers can be opaque black‐boxes. In particular, there are two very common difficulties encountered by grammar developers: understanding why a grammar fragment results in a parser non‐determinism and determining why a generated parser incorrectly interprets an input sentence. This paper describes ANTLRWorks, a complete development environment for ANTLR grammars that attempts to resolve these difficulties and, in general, make grammar development more accessible to the average programmer. The main components are a grammar editor with refactoring and navigation features, a grammar interpreter, and a domain‐specific grammar debugger. ANTLRWorks' primary contributions are a parser non‐determinism visualizer based on syntax diagrams and a time‐traveling debugger that pays special attention to parser decision‐making by visualizing lookahead usage and speculative parsing during backtracking. Copyright © 2008 John Wiley & Sons, Ltd.
