首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 421 毫秒
1.
Markovian architectural bias of recurrent neural networks   总被引:5,自引:0,他引:5  
In this paper, we elaborate upon the claim that clustering in the recurrent layer of recurrent neural networks (RNNs) reflects meaningful information processing states even prior to training. By concentrating on activation clusters in RNNs, while not throwing away the continuous state space network dynamics, we extract predictive models that we call neural prediction machines (NPMs). When RNNs with sigmoid activation functions are initialized with small weights (a common technique in the RNN community), the clusters of recurrent activations emerging prior to training are indeed meaningful and correspond to Markov prediction contexts. In this case, the extracted NPMs correspond to a class of Markov models, called variable memory length Markov models (VLMMs). In order to appreciate how much information has really been induced during the training, the RNN performance should always be compared with that of VLMMs and NPMs extracted before training as the "" base models. Our arguments are supported by experiments on a chaotic symbolic sequence and a context-free language with a deep recursive structure.  相似文献   

2.
The classical stochastic analog of the deterministic linear system in engineering is the linear system driven by white noise. As the promise of artificial neural networks in modeling nonlinear systems continues to grow, the need for a stochastic analog with quantitative foundations for analysis and synthesis will increase. This paper presents recent work in this direction, examining recurrent neural nets (RNN driven by white noise. We examine the effect of noise on the typical continuous-time RNN model. First, we perform qualitative analysis establishing uniform boundedness of moments of the neuron states over time. To enable practical application, however, it is necessary to relate these properties to useful measures that can be estimated. We thus subsequently derive bias and variance measures for the noisy RNN with respect to the corresponding deterministic RNN. This has significant practical implications, since net design is nonminimal in the sense that several nets can solve the same problem. The results allow the user to evaluate given RNN for noise performance. The designer can use these results to constrain the design space so that the design satisfies performance specifications whenever possible. An example is provided using the measures derived in this paper to predetermine the best among several RNN designs for a given problem.  相似文献   

3.
Classical Hausdorff dimension (sometimes called fractal dimension) was recently effectivized using gales (betting strategies that generalize martingales), thereby endowing various complexity classes with dimension structure and also defining the constructive dimensions of individual binary (infinite) sequences. In this paper we use gales computed by multi-account finite-state gamblers to develop the finite-state dimensions of sets of binary sequences and individual binary sequences. The theorem of Eggleston (Quart. J. Math. Oxford Ser. 20 (1949) 31–36) relating Hausdorff dimension to entropy is shown to hold for finite-state dimension, both in the space of all sequences and in the space of all rational sequences (binary expansions of rational numbers). Every rational sequence has finite-state dimension 0, but every rational number in [0,1] is the finite-state dimension of a sequence in the low-level complexity class AC0. Our main theorem shows that the finite-state dimension of a sequence is precisely the infimum of all compression ratios achievable on the sequence by information-lossless finite-state compressors.  相似文献   

4.
In this paper, we consider probabilistic context-free grammars, a class of generative devices that has been successfully exploited in several applications of syntactic pattern matching, especially in statistical natural language parsing. We investigate the problem of training probabilistic context-free grammars on the basis of distributions defined over an infinite set of trees or an infinite set of sentences by minimizing the cross-entropy. This problem has applications in cases of context-free approximation of distributions generated by more expressive statistical models. We show several interesting theoretical properties of probabilistic context-free grammars that are estimated in this way, including the previously unknown equivalence between the grammar cross-entropy with the input distribution and the so-called derivational entropy of the grammar itself. We discuss important consequences of these results involving the standard application of the maximum-likelihood estimator on finite tree and sentence samples, as well as other finite-state models such as hidden Markov models and probabilistic finite automata.  相似文献   

5.
There has been a lot of interest in the use of discrete-time recurrent neural nets (DTRNN) to learn finite-state tasks, with interesting results regarding the induction of simple finite-state machines from input-output strings. Parallel work has studied the computational power of DTRNN in connection with finite-state computation. This article describes a simple strategy to devise stable encodings of finite-state machines in computationally capable discrete-time recurrent neural architectures with sigmoid units and gives a detailed presentation on how this strategy may be applied to encode a general class of finite-state machines in a variety of commonly used first- and second-order recurrent neural networks. Unlike previous work that either imposed some restrictions to state values or used a detailed analysis based on fixed-point attractors, our approach applies to any positive, bounded, strictly growing, continuous activation function and uses simple bounding criteria based on a study of the conditions under which a proposed encoding scheme guarantees that the DTRNN is actually behaving as a finite-state machine.  相似文献   

6.
Neural networks do not readily provide an explanation of the knowledge stored in their weights as part of their information processing. Until recently, neural networks were considered to be black boxes, with the knowledge stored in their weights not readily accessible. Since then, research has resulted in a number of algorithms for extracting knowledge in symbolic form from trained neural networks. This article addresses the extraction of knowledge in symbolic form from recurrent neural networks trained to behave like deterministic finite-state automata (DFAs). To date, methods used to extract knowledge from such networks have relied on the hypothesis that networks' states tend to cluster and that clusters of network states correspond to DFA states. The computational complexity of such a cluster analysis has led to heuristics that either limit the number of clusters that may form during training or limit the exploration of the space of hidden recurrent state neurons. These limitations, while necessary, may lead to decreased fidelity, in which the extracted knowledge may not model the true behavior of a trained network, perhaps not even for the training set. The method proposed here uses a polynomial time, symbolic learning algorithm to infer DFAs solely from the observation of a trained network's input-output behavior. Thus, this method has the potential to increase the fidelity of the extracted knowledge.  相似文献   

7.
方冰  韩冰 《控制与决策》2023,38(2):546-554
针对已有概率犹豫模糊熵测度构造复杂、区分能力弱等缺点,提出一种混合型概率犹豫模糊熵测度.混合型熵测度能够综合反映概率犹豫模糊元所具有的个体不确定性和整体不确定性,具有结构简单、物理意义明确、区分能力强等优势.在概率犹豫模糊元规范化的基础上,基于混合型熵测度的构造理念所设计的混合型交叉熵测度,能够克服已有交叉熵测度的设计缺陷,综合反映两个概率犹豫模糊元之间的个体区分度和整体区分度,且具有自然的对称性.基于混合型熵测度和交叉熵测度,进一步设计概率犹豫模糊环境下的多属性决策方法,并将其应用于无人机集群作战效能评估.数值和理论分析结果表明,所提出的混合型概率犹豫模糊熵和交叉熵测度能够成对设计,互为印证,具有广泛的应用前景.  相似文献   

8.
Recently, a number of authors have explored the use of recursive neural nets (RNN) for the adaptive processing of trees or tree-like structures. One of the most important language-theoretical formalizations of the processing of tree-structured data is that of deterministic finite-state tree automata (DFSTA). DFSTA may easily be realized as RNN using discrete-state units, such as the threshold linear unit. A recent result by J. Sima (1997) shows that any threshold linear unit operating on binary inputs can be implemented in an analog unit using a continuous activation function and bounded real inputs. The constructive proof finds a scaling factor for the weights and reestimates the bias accordingly. We explore the application of this result to simulate DFSTA in sigmoid RNN (that is, analog RNN using monotonically growing activation functions) and also present an alternative scheme for one-hot encoding of the input that yields smaller weight values, and therefore works at a lower saturation level  相似文献   

9.
This work presents a statistical model to recognize pen-based music compositions using stroke recognition algorithms and finite-state machines. The series of strokes received as input is mapped onto a stochastic representation, which is combined with a formal language that describes musical symbols in terms of stroke primitives. Then, a Probabilistic Finite-State Automaton is obtained, which defines probabilities over the set of musical sequences. This model is eventually crossed with a semantic language to avoid sequences that does not make musical sense. Finally, a decoding strategy is applied in order to output a hypothesis about the musical sequence actually written. Comprehensive experimentation with several decoding algorithms, stroke similarity measures and probability density estimators are tested and evaluated following different metrics of interest. Results found have shown the goodness of the proposed model, obtaining competitive performances in all metrics and scenarios considered.  相似文献   

10.
The stochastic control problem is defined in terms of state probability. Clearly, the system is designed in such a manner that its state probability tracks the desired state probability of the reference system. The tracking criterion to be minimized is the path cross-entropy (or relative entropy or Kullback entropy) of the two probability density functions, and the problem then turns out to be a distributed parameter one in which the state dynamical equation is the Fokker-Planck equation. The explicit solution of the conjugate equation is obtained by using expansions in Hermite's polynomials. In the special case of neighbouring-optimal control, a slight extension of the LQG approach is proposed, by using a cost function which is a weighted combination of the path cross-entropy and a control quadratic term. Future improvements are discussed.  相似文献   

11.
Presents a method of generating test sequences for concurrent programs and communication protocols that are modeled as communicating nondeterministic finite-state machines (CNFSMs). A conformance relation, called trace-equivalence, is defined within this model, serving as a guide to test generation. A test generation method for a single nondeterministic finite-state machine (NFSM) is developed, which is an improved and generalized version of the Wp-method that generates test sequences only for deterministic finite-state machines. It is applicable to both nondeterministic and deterministic finite-state machines. When applied to deterministic finite-state machines, it yields usually smaller test suites with full fault coverage than the existing methods that also provide full fault coverage, provided that the number of states in implementation NFSMs are bounded by a known integer. For a system of CNFSMs, the test sequences are generated in the following manner: a system of CNFSMs is first reduced into a single NFSM by reachability analysis; then the test sequences are generated from the resulting NFSM using the generalized Wp-method  相似文献   

12.
In this work, physics-based recurrent neural network (RNN) modeling approaches are proposed for a general class of nonlinear dynamic process systems to improve prediction accuracy by incorporating a priori process knowledge. Specifically, a hybrid modeling method is first introduced to integrate first-principles models and RNN models. Subsequently, a partially-connected RNN modeling method that designs the RNN structure based on a priori structural process knowledge, and a weight-constrained RNN modeling method that employs weight constraints in the optimization problem of the RNN training process are developed. The proposed physics-based RNN models are utilized in model predictive controllers and applied to a chemical process network example to demonstrate their improved approximation performance compared to the fully-connected RNN model that is developed as a black box model.  相似文献   

13.
混沌二进制序列的伪随机性和复杂性分析   总被引:1,自引:0,他引:1  
分析和讨论了由经典的Lorenz混沌系统和Chebyshev映射所生成的二进制序列的伪随机性和复杂性,采用T.Kohda混沌二进制量化算法,将混沌系统所产生的实数序列转换为相应的二进制序列;从统计检验、自相关性、频谱、Lempel-Ziv复杂度和近似熵等多方面对序列的伪随机性和复杂性进行定量分析。统计分析结果表明对由混沌系统所产生的有限二进制序列逼近Lempel-Ziv意义的随机序列,它具有较高的伪随机性、复杂性和非周期性,但是序列的伪随机性和复杂性并不随序列长度的增加而提高,在近似熵评价指标中呈显出降低的趋势。同时,作为伪随机源,Lorenz混沌系统略比Chebyshev映射好。  相似文献   

14.
It has been shown that if a recurrent neural network (RNN) learns to process a regular language, one can extract a finite-state machine (FSM) by treating regions of phase-space as FSM states. However, it has also been shown that one can construct an RNN to implement Turing machines by using RNN dynamics as counters. But how does a network learn languages that require counting? Rodriguez, Wiles, and Elman (1999) showed that a simple recurrent network (SRN) can learn to process a simple context-free language (CFL) by counting up and down. This article extends that to show a range of language tasks in which an SRN develops solutions that not only count but also copy and store counting information. In one case, the network stores information like an explicit storage mechanism. In other cases, the network stores information more indirectly in trajectories that are sensitive to slight displacements that depend on context. In this sense, an SRN can learn analog computation as a set of interdependent counters. This demonstrates how SRNs may be an alternative psychological model of language or sequence processing.  相似文献   

15.
A simple associationist neural network learns to factor abstract rules (i.e., grammars) from sequences of arbitrary input symbols by inventing abstract representations that accommodate unseen symbol sets as well as unseen but similar grammars. The neural network is shown to have the ability to transfer grammatical knowledge to both new symbol vocabularies and new grammars. Analysis of the state-space shows that the network learns generalized abstract structures of the input and is not simply memorizing the input strings. These representations are context sensitive, hierarchical, and based on the state variable of the finite-state machines that the neural network has learned. Generalization to new symbol sets or grammars arises from the spatial nature of the internal representations used by the network, allowing new symbol sets to be encoded close to symbol sets that have already been learned in the hidden unit space of the network. The results are counter to the arguments that learning algorithms based on weight adaptation after each exemplar presentation (such as the long term potentiation found in the mammalian nervous system) cannot in principle extract symbolic knowledge from positive examples as prescribed by prevailing human linguistic theory and evolutionary psychology.  相似文献   

16.
One-counter nets are finite-state machines operating on a variable (counter), which ranges over the natural numbers. Each transition can increase or decrease the counter’s value, and a decrease is possible only if the result is nonnegative; hence, zero testing is not allowed. The class of one-counter nets is equivalent in terms of its expressive power to the class of Petri nets with one unbounded place and to the class of pushdown automata where the stack alphabet contains one symbol. We present a specific method of approximating the largest bisimulation of a one-counter net based on single-periodic arithmetic and the notion of stratified bisimulation.  相似文献   

17.
Petri网的符号ZBDD可达树分析技术   总被引:2,自引:0,他引:2  
Petri网是一种适合于并发系统建模、分析和控制的图形工具.可达树是Petri网分析的典型技术之一,它通过标识向量集合表征系统的状态空间,组合复杂性严重制约了该分析技术可处理系统问题的规模.零压缩决策图(Zero-Suppressed Binary Decision Diagrams,ZBDD)是一种新型的数据结构,是表示和处理稀疏向量集合的一种有效技术.文章基于Petri网町达标识向量的稀疏特征,给出了Petri网分析的符号ZBDD技术,该技术通过对标识向量(状态)的布尔向量表示、可达标识向最(状态)的符号ZBDD生成,实现Petri网可达状态空间的高效符号操作和紧凑符号表示.实验表明,基于ZBDD的符号可达性分析算法能够有效处理较大规模Petri网问题.  相似文献   

18.
Financial volatility trading using recurrent neural networks   总被引:2,自引:0,他引:2  
We simulate daily trading of straddles on financial indexes. The straddles are traded based on predictions of daily volatility differences in the indexes. The main predictive models studied are recurrent neural nets (RNN). Such applications have often been studied in isolation. However, due to the special character of daily financial time-series, it is difficult to make full use of RNN representational power. Recurrent networks either tend to overestimate noisy data, or behave like finite-memory sources with shallow memory; they hardly beat classical fixed-order Markov models. To overcome data nonstationarity, we use a special technique that combines sophisticated models fitted on a larger data set, with a fixed set of simple-minded symbolic predictors using only recent inputs. Finally, we compare our predictors with the GARCH family of econometric models designed to capture time-dependent volatility structure in financial returns. GARCH models have been used to trade volatility. Experimental results show that while GARCH models cannot generate any significantly positive profit, by careful use of recurrent networks or Markov models, the market makers can generate a statistically significant excess profit, but then there is no reason to prefer RNN over much more simple and straightforward Markov models. We argue that any report containing RNN results on financial tasks should be accompanied by results achieved by simple finite-memory sources combined with simple techniques to fight nonstationarity in the data.  相似文献   

19.
Thereachability, deadlok detection andunboundedness detection problems are considered for the class ofcyclic one-type message networks of communicating finite state machines. We show that all the three problems are effectively solvable by (a) constructing canonical execution event sequences which belong to a context-free language, and (b) showing that the reachability sets are semilinear. Our algorithms have polynomial complexity in terms of size of a global structure of a network, called theshuffle-product. The relationships between general Petri nets and the class of communicating finite state machines considered here are also explored.Supported in part by NSF CCR-9004121  相似文献   

20.
Jump Markov linear systems are linear systems whose parameters evolve with time according to a finite-state Markov chain. Given a set of observations, our aim is to estimate the states of the finite-state Markov chain and the continuous (in space) states of the linear system. The computational cost in computing conditional mean or maximum a posteriori (MAP) state estimates of the Markov chain or the state of the jump Markov linear system grows exponentially in the number of observations. We present three globally convergent algorithms based on stochastic sampling methods for state estimation of jump Markov linear systems. The cost per iteration is linear in the data length. The first proposed algorithm is a data augmentation (DA) scheme that yields conditional mean state estimates. The second proposed scheme is a stochastic annealing (SA) version of DA that computes the joint MAP sequence estimate of the finite and continuous states. Finally, a Metropolis-Hastings DA scheme based on SA is designed to yield the MAP estimate of the finite-state Markov chain. Convergence results of the three above-mentioned stochastic algorithms are obtained. Computer simulations are carried out to evaluate the performances of the proposed algorithms. The problem of estimating a sparse signal developing from a neutron sensor based on a set of noisy data from a neutron sensor and the problem of narrow-band interference suppression in spread spectrum code-division multiple-access (CDMA) systems are considered  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号