首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 829 毫秒
1.
A new and simple method for the inference of regular grammars or finite automata is presented. It is based on the analysis of the successive appearances of the terminal symbols in the learning strings. It is shown that for syntactic pattern recognition applications, this method is more efficient than other algorithms already proposed.  相似文献   

2.
Dashed lines are a common, and semantically important, element in line drawings. This paper deals with the inference of a dashed lines' grammar, given a stream of graphical symbols. A method is presented, based on syntactical pattern recognition, capable of inferring arbitrary grammars without a priori knowledge. A detailed complexity-analysis of the developed algorithms is presented, as well as experiments demonstrating the usefullness of our method.  相似文献   

3.
Building parsers is an essential task for the development of many tools, from software maintenance tools to any kind of business-specific, programmable environment having a command-line interface. Whilst grammars for many programming languages are available, these are, very often, almost useless because of the large diffusion of dialects and variants not contemplated by standard grammars. Writing a grammar by hand is clearly feasible, however it can be a tedious and error-prone task, requiring appropriate skills not always available. Grammar inference is a possible, challenging approach for obtaining suitable grammars from program examples. However, inference from scratch poses serious scalability issues and tends to produce correct, but meaningless grammars, hard to be understood and used to build tools. This paper describes an approach, based on genetic algorithms, for evolving existing grammars towards target (dialect) grammars, inferring changes from examples written using the dialect. Results obtained experimenting the inference of C dialect rules show that the algorithm is able to successfully evolve the grammar. Inspections indicated that the changes automatically made to the grammar during its evolution preserved its meaningfulness, and were comparable to what a developer could have done by hand.  相似文献   

4.
The high complexity of natural language and the huge amount of human and temporal resources necessary for producing the grammars lead several researchers in the area of Natural Language Processing to investigate various solutions for automating grammar generation and updating processes. Many algorithms for Context-Free Grammar inference have been developed in the literature. This paper provides a survey of the methodologies for inferring context-free grammars from examples, developed by researchers in the last decade. After introducing some preliminary definitions and notations concerning learning and inductive inference, some of the most relevant existing grammatical inference methods for Natural Language are described and classified according to the kind of presentation (if text or informant) and the type of information (if supervised, unsupervised, or semi-supervised). Moreover, the state of the art of the strategies for evaluation and comparison of different grammar inference methods is presented. The goal of the paper is to provide a reader with introduction to major concepts and current approaches in Natural Language Learning research.  相似文献   

5.
Grammatical inference – used successfully in a variety of fields such as pattern recognition, computational biology and natural language processing – is the process of automatically inferring a grammar by examining the sentences of an unknown language. Software engineering can also benefit from grammatical inference. Unlike these other fields, which use grammars as a convenient tool to model naturally occurring patterns, software engineering treats grammars as first-class objects typically created and maintained for a specific purpose by human designers. We introduce the theory of grammatical inference and review the state of the art as it relates to software engineering.  相似文献   

6.
Several equivalence decision algorithms for classes of grammars, program schemes, transducers follow the general pattern of the Korenjak-Hopcroft algorithm for deciding the equivalence of simple deterministic grammars. An axiomatic framework is presented which points out the essence of the Korenjak-Hopcroft algorithm and applies to numerous situations.  相似文献   

7.
Inference of high-dimensional grammars is discussed. Specifically, techniques for inferring tree grammars are briefly presented. The problem of inferring a stochastic grammar to model the behavior of an information source is also introduced and techniques for carrying out the inference process are presented for a class of stochastic finite-state and context-free grammars. The possible practical application of these methods is illustrated by examples.  相似文献   

8.
We present a compiler that can be used to automatically obtain efficient Java implementations of parsing algorithms from formal specifications expressed as parsing schemata. The system performs an analysis of the inference rules in the input schemata in order to determine the best data structures and indexes to use, and to ensure that the generated implementations are efficient. The system described is general enough to be able to handle all kinds of schemata for different grammar formalisms, such as context‐free grammars and tree‐adjoining grammars, and it provides an extensibility mechanism allowing the user to define custom notational elements. This compiler has proven very useful for analyzing, prototyping and comparing natural‐language parsers in real domains, as can be seen in the empirical examples provided at the end of the paper. Copyright © 2008 John Wiley & Sons, Ltd.  相似文献   

9.
This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples. The approach employs a distributed, steady-state genetic algorithm, with a fitness function incorporating a prior over the space of possible grammars. Our choice of prior is designed to bias learning towards structurally simpler grammars. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. Full details are given of our genetic algorithm (GA) and of our fitness function for grammars. We present the results of a number of experiments in learning grammars for a range of formal languages. Finally we compare the grammars induced using the GA-based approach with those found using the inside-outside algorithm. We find that our approach learns grammars that are both compact and fit the corpus data well.  相似文献   

10.
Pattern selector grammars are defined in general. We concentrate on the study of special grammars, the pattern selectors of which contain precisely κ “one”s (01(101)κ) or κ adjacent “one”s (011κ01). This means that precisely κ symbols (resp. κ adjacent symbols) in each sentential form are rewritten. The main results concern parsing algorithms and the complexity of the membership problem. We first obtain a polynomial bound on the shortest derivation and hence an NP time bound for parsing. In the case κ = 2, we generalize the well-known context-free dynamic programming type algorithms, which run in polynomial time. It is shown that the generated languages, for κ = 2, are log-space reducible to the context-free languages. The membership problem is thus solvable in log2 space.  相似文献   

11.
In this paper, fuzzy inference models for pattern classifications have been developed and fuzzy inference networks based on these models are proposed. Most of the existing fuzzy rule-based systems have difficulties in deriving inference rules and membership functions directly from training data. Rules and membership functions are obtained from experts. Some approaches use backpropagation (BP) type learning algorithms to learn the parameters of membership functions from training data. However, BP algorithms take a long time to converge and they require an advanced setting of the number of inference rules. The work to determine the number of inference rules demands lots of experiences from the designer. In this paper, self-organizing learning algorithms are proposed for the fuzzy inference networks. In the proposed learning algorithms, the number of inference rules and the membership functions in the inference rules will be automatically determined during the training procedure. The learning speed is fast. The proposed fuzzy inference network (FIN) classifiers possess both the structure and the learning ability of neural networks, and the fuzzy classification ability of fuzzy algorithms. Simulation results on fuzzy classification of two-dimensional data are presented and compared with those of the fuzzy ARTMAP. The proposed fuzzy inference networks perform better than the fuzzy ARTMAP and need less training samples.  相似文献   

12.
Fuzzy context-free max- grammar (or FCFG, for short), as a straightforward extension of context-free grammar, has been introduced to express uncertainty, imprecision, and vagueness in natural language fragments. Li recently proposed the approximation of fuzzy finite automata, which may effectively deal with the practical problems of fuzziness, impreciseness and vagueness. In this paper, we further develop the approximation of fuzzy context-free grammars. In particular, we show that a fuzzy context-free grammar under max- compositional inference can be approximated by some fuzzy context-free grammar under max-min compositional inference with any given accuracy. In addition, some related properties of fuzzy context-free grammars and fuzzy languages generated by them are studied. Finally, the sensitivity of fuzzy context-free grammars is also discussed.  相似文献   

13.
When concerned about efficient grammatical inference two issues are relevant: the first one is to determine the quality of the result, and the second is to try to use polynomial time and space. A typical idea to deal with the first point is to say that an algorithm performs well if it infers in the limit the correct language. The second point has led to debate about how to define polynomial time: the main definitions of polynomial inference have been proposed by Pitt and Angluin. We return in this paper to a definition proposed by Gold that requires a characteristic set of strings to exist for each grammar, and this set to be polynomial in the size of the grammar or automaton that is to be learned, where the size of the sample is the sum of the lengths of all strings it includes. The learning algorithm must also infer correctly as soon as the characteristic set is included in the data. We first show that this definition corresponds to a notion of teachability as defined by Goldman and Mathias. By adapting their teacher/learner model to grammatical inference we prove that languages given by context-free grammars, simple deterministic grammars, linear grammars and nondeterministic finite automata are not identifiable in the limit from polynomial time and data.  相似文献   

14.
Producing sentences from a grammar, according to various criteria, is required in many applications. It is also a basic building block for grammar engineering. This paper presents a toolkit for context-free grammars, which mainly consists of several algorithms for sentence generation or enumeration and for coverage analysis for context-free grammars. The toolkit deals with general context-free grammars. Besides providing implementations of algorithms, the toolkit also provides a simple graphical user interface, through which the user can use the toolkit directly. The toolkit is implemented in Java and is available at http://lcs.ios.ac.cn/~zhiwu/toolkit.php. In the paper, the overview of the toolkit and the major algorithms implemented in the toolkit are presented, and experimental results and preliminary applications of the toolkit are also contained.  相似文献   

15.
Tree systems have been studied in the theoretical setting as an extension of finite automata. They have been found useful in the practical domain when applied to syntactic pattern recognition. The practical applications of tree systems have motivated the examination of inference techniques for tree grammars and tree automata. In this paper we present a tree automaton inference algorithm which incorporates three concepts-tree derivatives, grammatical expansion, and inferential strength. Tree derivatives are used for comparing tree forms. Grammatical expansion is a feedback mechanism which effectively enlarges the user sample. An inferential strength parameter is input by the user to indicate the amount of support required from the sample for inferences. The algorithm is also applied for inferring finite-state machines. Finally, we address an open problem posed by Joshi and Levy by demonstrating the use of our algorithm for the design of programming languages.  相似文献   

16.
A family of parsing algorithms for general context-free grammars is described. Its members have a top-down structure, while performing tests similar to those introduced in both LR and LL algorithms. A theoretical study shows that the algorithms have the same time and space bounds as Earley's algorithms, which are particular members of the family. Empirical comparisons show the effectiveness of the new members of the family, which appear to be definitely better than Earley's algorithms, except for a few pathological grammars.  相似文献   

17.
A general method is proposed for incorporating rule-based constraints corresponding to regular languages into stochastic inference problems, thereby allowing for a unified representation of stochastic and syntactic pattern constraints. The authors' approach establishes the formal connection of rules to Chomsky grammars and generalizes the original work of Shannon on the encoding of rule-based channel sequences to Markov chains of maximum entropy. This maximum entropy probabilistic view leads to Gibbs representations with potentials which have their number of minima growing at precisely the exponential rate that the language of deterministically constrained sequences grow. These representations are coupled to stochastic diffusion algorithms, which sample the language-constrained sequences by visiting the energy minima according to the underlying Gibbs probability law. This coupling yields the result that fully parallel stochastic cellular automata can be derived to generate samples from the rule-based constraint sets. The production rules and neighborhood state structure of the language of sequences directly determine the necessary connection structures of the required parallel computing surface. Representations of this type have been mapped to the DAP-510 massively parallel processor consisting of 1024 mesh-connected bit-serial processing elements for performing automated segmentation of electron-micrograph images.  相似文献   

18.
Extended context-free grammars allow regular expressions to appear in productions right hand sides, and are a clear and natural way to describe the syntax of programming languages.In this paper an LR parsing technique for extended context-free grammars is presented, which is based on an underlying transformation of the grammar into an equivalent context-free one.The technique is suitable for inclusion in one-pass compilers: the implementation requires little extensions to the algorithms working for normal LR grammars. Besides describing the parsing method, the paper shows also the algorithms for deriving the parsing tables; tables optimization is also discussed. Finally, this technique is compared with other proposals appeared in the literature.  相似文献   

19.
Interior-defined regions, flood-fill algorithms, and a simple 4-connected region filling algorithm are presented together with their properties. an algorithm for region filling using two-dimensional grammars is also presented together with illustrative examples. the results obtained in this article may have useful application in intelligent systems, computer graphics, artificial intelligence, expert systems, knowledge engineering, pattern recognition, pictorial databases, and related areas.  相似文献   

20.
Trees can be conveniently compressed with linear straight-line context-free tree grammars. Such grammars generalize straight-line context-free string grammars which are widely used in the development of algorithms that execute directly on compressed structures (without prior decompression). It is shown that every linear straight-line context-free tree grammar can be transformed in polynomial time into a monadic (and linear) one. A tree grammar is monadic if each nonterminal uses at most one context parameter. Based on this result, polynomial time algorithms are presented for testing whether a given (i) nondeterministic tree automaton or (ii) nondeterministic tree automaton with sibling-constraints or (iii) nondeterministic tree walking automaton, accepts a tree represented by a linear straight-line context-free tree grammar. It is also shown that if tree grammars are nondeterministic or non-linear, then reducing their numbers of parameters cannot be done without an exponential blow-up in grammar size.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号