首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
岳博  焦李成 《计算机学报》2000,23(11):1160-1165
弧的删除是一种对Bayes网络模型进行近似的方法。文中以Kullback-Leibler偏差作为近似网络和原网络概率分布误差的测度,给出了近似网络在此测度意义下的最优参数。同时,也给出了通过对原网络删除多条弧进行近似的启发式算法,当给定一个误差上界时,可以使用此算法寻找满足误差要求的近似网络。  相似文献   

2.
Auer  Peter  Long  Philip M.  Maass  Wolfgang  Woeginger  Gerhard J. 《Machine Learning》1995,18(2-3):187-230
The majority of results in computational learning theory are concerned with concept learning, i.e. with the special case of function learning for classes of functions with range {0, 1}. Much less is known about the theory of learning functions with a larger range such as or . In particular relatively few results exist about the general structure of common models for function learning, and there are only very few nontrivial function classes for which positive learning results have been exhibited in any of these models.We introduce in this paper the notion of a binary branching adversary tree for function learning, which allows us to give a somewhat surprising equivalent characterization of the optimal learning cost for learning a class of real-valued functions (in terms of a max-min definition which does not involve any learning model).Another general structural result of this paper relates the cost for learning a union of function classes to the learning costs for the individual function classes.Furthermore, we exhibit an efficient learning algorithm for learning convex piecewise linear functions from d into . Previously, the class of linear functions from d into was the only class of functions with multidimensional domain that was known to be learnable within the rigorous framework of a formal model for online learning.Finally we give a sufficient condition for an arbitrary class of functions from into that allows us to learn the class of all functions that can be written as the pointwise maximum ofk functions from . This allows us to exhibit a number of further nontrivial classes of functions from into for which there exist efficient learning algorithms.  相似文献   

3.
This paper presents a polynomial-time algorithm for inferring a probabilistic generalization of the class of read-once Boolean formulas over the usual basis {AND, OR, NOT}. The algorithm effectively infers a good approximation of the target formula when provided with random examples which are chosen according to anyproduct distribution, i.e., any distribution in which the setting of each input bit is chosen independently of the settings of the other bits. Since the class of formulas considered includes ordinary read-once Boolean formulas, our result shows that such formulas are PAC learnable (in the sense of Valiant) against any product distribution (for instance, against the uniform distribution). Further, this class of probabilistic formulas includes read-once formulas whose behavior has been corrupted by large amounts of random noise. Such noise may affect the formula's output (misclassification noise), the input bits (attribute noise), or it may affect the behavior of individual gates of the formula. Thus, in this setting, we show that read-once formula's can be inferred (approximately), despite large amounts of noise affecting the formula's behavior.  相似文献   

4.
We characterize the classes of languages over finite alphabets which may be described by P automata, i.e., accepting P systems with communication rules only. Motivated by properties of natural computing systems, and the actual behavior of P automata, we study computational complexity classes with a certain restriction on the use of the available workspace in the course of computations and relate these to the language classes described by P automata. We prove that if the rules of the P system are applied sequentially, then the accepted language class is strictly included in the class of languages accepted by one-way Turing machines with a logarithmically bounded workspace, and if the rules are applied in the maximally parallel manner, then the class of context-sensitive languages is obtained.  相似文献   

5.
朱凯  毋国庆  吴理华  袁梦霆 《软件学报》2019,30(7):2033-2051
自动机的重置序列也称为同步序列,具有以下特性:有限自动机通过运行重置序列w,可从任意一个未知的或无法观测到的状态q0到达某个特定状态qw.这仅依赖于w,而与开始运行w时的状态q0无关.这一特性可用于部分可观察的复杂系统的自动恢复,而无需重启,甚至有时不能重启.基于此,重置问题自出现以来便得到关注和持续研究.最近几年,它被扩展到可以描述诸如分布式、嵌入式实时系统等复杂系统的无限状态模型上,比如时间自动机和寄存器自动机等.以时间自动机的重置问题的计算复杂性为研究对象,发现重置问题与可达性问题有着紧密的联系.主要贡献是:(1)利用时间自动机可达性问题的最新成果,完善完全的确定的时间自动机重置问题的计算复杂性结论;(2)对部分规约的确定的时间自动机,研究得出,即使在输入字母表大小减至2的情况下,其复杂性仍是PSPACE-完全的;特别地,在单时钟情况下是NLOGSPACE-完全的;(3)对完全的非确定的时间自动机,研究得出其Di-可重置问题(i=1,2,3)是不可判定的,其重置问题与非确定的寄存器自动机重置问题在指数时间可以相互归约,通过证明指数时间归约相对高复杂性类具有封闭性,利用非确定的寄存器自动机的结论得出单时钟的时间自动机的重置问题是Ackermann-完全的、限界的重置问题是NEXPTIME-完全的.这些复杂性结论,说明关于时间自动机的重置问题大都是难解的,一方面,为时间系统的可重置性的检测和求解奠定坚实的理论基础,另一方面,为以后寻找具有高效算法的特殊结构的时间系统(即具有高效算法的问题子类)给予理论指导.  相似文献   

6.
Long  Philip M. 《Machine Learning》1999,37(3):337-354
We show that a bound on the rate of drift of the distribution generating the examples is sufficient for agnostic learning to relative accuracy , where c > 0 is a constant; this matches a known necessary condition to within a constant factor. We establish a sufficient condition for the realizable case, also matching a known necessary condition to within a constant factor. We provide a relatively simple proof of a bound of + on the sample complexity of agnostic learning in a fixed environment.  相似文献   

7.
Aizenstein  Howard  Pitt  Leonard 《Machine Learning》1995,19(3):183-208
We present two related results about the learnability of disjunctive normal form (DNF) formulas. First we show that a common approach for learning arbitrary DNF formulas requires exponential time. We then contrast this with a polynomial time algorithm for learning most (rather than all) DNF formulas. A natural approach for learning boolean functions involves greedily collecting the prime implicants of the hidden function. In a seminal paper of learning theory, Valiant demonstrated the efficacy of this approach for learning monotone DNF, and suggested this approach for learning DNF. Here we show that no algorithm using such an approach can learn DNF in polynomial time. We show this by constructing a counterexample DNF formula which would force such an algorithm to take exponential time. This counterexample seems to capture much of what makes DNF hard to learn, and thus is useful to consider when evaluating the run-time of a proposed DNF learning algorithm. This hardness result, as well as other hardness results for learning DNF, relies on the construction of particular hard-to-learn formulas, formulas that appear to be relatively rare. This raises the question of whether most DNF formulas are learnable. For certain natural definitions of most DNF formulas, we answer this question affirmatively.  相似文献   

8.
A consumer entering a new bookstore can face more than 250,000alternatives. The efficiency of compensatory and noncompensatory decisionrulesfor finding a preferred item depends on the efficiency of their associatedinformation operators. At best, item-by-item information operators lead tolinear computational complexity; set information operators, on the other hand,can lead to constant complexity. We perform an experiment demonstrating thatsubjects are approximately rational in selecting between sublinear and linearrules. Many markets are organized by attributes that enable consumers toemploya set-selection-by-aspect rule using set information operations. In cyberspacedecision rules are encoded as decision aids.  相似文献   

9.
陈亚瑞 《计算机科学》2013,40(2):253-256,288
图模型概率推理的主要任务是通过对联合概率分布进行变量求和来计算配分函数、变量边缘概率分布、条件 概率分布等。图模型概率推理计算复杂性及近似概率推理的计算复杂性是一重要的理论问题,也是设计概率推理算 法和近似概率推理算法的理论基础。研究了Ising图模型概率推理的计算复杂性,包括概率推理的难解性及不可近似 性。具体地,通过构建#2 SA"I'问题到Icing图模型概率推理问题的多项式时间计数归约,证明在一般 Ising图模型上 计算配分函数、变量边缘概率分布、条件概率分布的概率推理问题是#P难的,同时证明Icing图模型近似概率推理问 题是NP难的,即一般Icing图模型上的概率推理问题是难解且不可近似的。  相似文献   

10.
This paper is concerned with the estimation of the dominant orientation of textured patches that appear in a number of images (remote sensing, biology or natural sciences for instance). It is based on the maximization of a criterion that deals with the coefficients enclosed in the different bands of a wavelet decomposition of the original image. More precisely, we search for the orientation that best concentrates the energy of the coefficients in a single direction. To compare the wavelet coefficients between the different bands, we use the Kullback-Leibler divergence on their distribution, this latter being assumed to behave like a Generalized Gaussian Density. The space-time localization of the wavelet transform allows to deal with any polygon that may be contained in a single image. This is of key importance when one works with (non-rectangular) segmented objects. We have applied the same methodology but using other criteria to compare the distributions, in order to highlight the benefit of the Kullback-Leibler divergence. In addition, the methodology is validated on synthetic and real situations and compared with a state-of-the-art approach devoted to orientation estimation.  相似文献   

11.
Kearns  Michael  Sebastian Seung  H. 《Machine Learning》1995,18(2-3):255-276
We introduce a new formal model in which a learning algorithm must combine a collection of potentially poor but statistically independent hypothesis functions in order to approximate an unknown target function arbitrarily well. Our motivation includes the question of how to make optimal use of multiple independent runs of a mediocre learning algorithm, as well as settings in which the many hypotheses are obtained by a distributed population of identical learning agents.  相似文献   

12.
This article presents an overview of Probabilistic Automata (PA) and discrete Hidden Markov Models (HMMs), and aims at clarifying the links between them. The first part of this work concentrates on probability distributions generated by these models. Necessary and sufficient conditions for an automaton to define a probabilistic language are detailed. It is proved that probabilistic deterministic automata (PDFA) form a proper subclass of probabilistic non-deterministic automata (PNFA). Two families of equivalent models are described next. On one hand, HMMs and PNFA with no final probabilities generate distributions over complete finite prefix-free sets. On the other hand, HMMs with final probabilities and probabilistic automata generate distributions over strings of finite length. The second part of this article presents several learning models, which formalize the problem of PA induction or, equivalently, the problem of HMM topology induction and parameter estimation. These learning models include the PAC and identification with probability 1 frameworks. Links with Bayesian learning are also discussed. The last part of this article presents an overview of induction algorithms for PA or HMMs using state merging, state splitting, parameter pruning and error-correcting techniques.  相似文献   

13.
Ron  Dana  Singer  Yoram  Tishby  Naftali 《Machine Learning》1996,25(2-3):117-149
We propose and analyze a distribution learning algorithm for variable memory length Markov processes. These processes can be described by a subclass of probabilistic finite automata which we name Probabilistic Suffix Automata (PSA). Though hardness results are known for learning distributions generated by general probabilistic automata, we prove that the algorithm we present can efficiently learn distributions generated by PSAs. In particular, we show that for any target PSA, the KL-divergence between the distribution generated by the target and the distribution generated by the hypothesis the learning algorithm outputs, can be made small with high confidence in polynomial time and sample complexity. The learning algorithm is motivated by applications in human-machine interaction. Here we present two applications of the algorithm. In the first one we apply the algorithm in order to construct a model of the English language, and use this model to correct corrupted text. In the second application we construct a simple stochastic model for E.coli DNA.  相似文献   

14.
Schmitt  Michael 《Machine Learning》1999,37(2):131-141
A neural network is said to be nonoverlapping if there is at most one edge outgoing from each node. We investigate the number of examples that a learning algorithm needs when using nonoverlapping neural networks as hypotheses. We derive bounds for this sample complexity in terms of the Vapnik-Chervonenkis dimension. In particular, we consider networks consisting of threshold, sigmoidal and linear gates. We show that the class of nonoverlapping threshold networks and the class of nonoverlapping sigmoidal networks on n inputs both have Vapnik-Chervonenkis dimension (nlog n). This bound is asymptotically tight for the class of nonoverlapping threshold networks. We also present an upper bound for this class where the constants involved are considerably smaller than in a previous calculation. Finally, we argue that the Vapnik-Chervonenkis dimension of nonoverlapping threshold or sigmoidal networks cannot become larger by allowing the nodes to compute linear functions. This sheds some light on a recent result that exhibited neural networks with quadratic Vapnik-Chervonenkis dimension.  相似文献   

15.
Recently, the complexity control of dynamic neural models has gained significant attention. The performance of such a process depends highly on the applied definition of model complexity. On the other hand, the learning theory creates a framework to assess the learning properties of models. These properties include the required size of the training samples as well as the statistical confidence over the model. In this Letter, we apply the learning properties of two families of Radial Basis Function Networks (RBFN's) to introduce new complexity measures that reflect the learning properties of such neural model. Then, based on these complexity terms we define cost functions, which provide a balance between the training and testing performances of the model.  相似文献   

16.
The -automaton is the weakest form of the nondeterministic version of the restarting automaton that was introduced by Jančar et al. to model the so-called analysis by reduction. Here it is shown that the class ℒ(R) of languages that are accepted by -automata is incomparable under set inclusion to the class of Church-Rosser languages and to the class of growing context-sensitive languages. In fact this already holds for the class of languages that are accepted by 2-monotone -automata. In addition, we prove that already the latter class contains -complete languages, showing that already the 2-monotone -automaton has a surprisingly large expressive power. The results of this paper have been announced at DLT 2004 in Auckland, New Zealand. This work was mainly carried out while T. Jurdziński was visiting the University of Kassel, supported by a grant from the Deutsche Forschungsgemeinschaft (DFG). F. Mráz and M. Plátek were partially supported by the Grant Agency of the Czech Republic under Grant-No. 201/04/2102 and by the program ‘Information Society’ under project 1ET100300517. F. Mráz was also supported by the Grant Agency of Charles University in Prague under Grant-No. 358/2006/A-INF/MFF.  相似文献   

17.
Learning Changing Concepts by Exploiting the Structure of Change   总被引:1,自引:0,他引:1  
This paper examines learning problems in which the target function is allowed to change. The learner sees a sequence of random examples, labelled according to a sequence of functions, and must provide an accurate estimate of the target function sequence. We consider a variety of restrictions on how the target function is allowed to change, including infrequent but arbitrary changes, sequences that correspond to slow walks on a graph whose nodes are functions, and changes that are small on average, as measured by the probability of disagreements between consecutive functions. We first study estimation, in which the learner sees a batch of examples and is then required to give an accurate estimate of the function sequence. Our results provide bounds on the sample complexity and allowable drift rate for these problems. We also study prediction, in which the learner must produce online a hypothesis after each labelled example and the average misclassification probability over this hypothesis sequence should be small. Using a deterministic analysis in a general metric space setting, we provide a technique for constructing a successful prediction algorithm, given a successful estimation algorithm. This leads to sample complexity and drift rate bounds for the prediction of changing concepts.  相似文献   

18.
Many process model analysis techniques rely on the accurate analysis of the natural language contents captured in the models’ activity labels. Since these labels are typically short and diverse in terms of their grammatical style, standard natural language processing tools are not suitable to analyze them. While a dedicated technique for the analysis of process model activity labels was proposed in the past, it suffers from considerable limitations. First of all, its performance varies greatly among data sets with different characteristics and it cannot handle uncommon grammatical styles. What is more, adapting the technique requires in-depth domain knowledge. We use this paper to propose a machine learning-based technique for activity label analysis that overcomes the issues associated with this rule-based state of the art. Our technique conceptualizes activity label analysis as a tagging task based on a Hidden Markov Model. By doing so, the analysis of activity labels no longer requires the manual specification of rules. An evaluation using a collection of 15,000 activity labels demonstrates that our machine learning-based technique outperforms the state of the art in all aspects.  相似文献   

19.
In this paper we illustrate the optimal filtering of log returns of commodity prices in which both the mean and volatility are modulated by a hidden Markov chain with finite state space. The optimal estimate of the Markov chain and the parameters of the price model are given in terms of discrete-time recursive filters. We provide an application on a set of high frequency gold price data for the period 1973-2006 and analyse the h-step ahead price predictions against the Diebold-Kilian metric. Within the modelling framework where the mean and volatility are switching regimes, our findings suggest that a two-state hidden Markov model is sufficient to describe the dynamics of the data and the gold price is predictable up to a certain extent in the short term but almost impossible to predict in the long term. The proposed model is also benchmarked with ARCH and GARCH models with respect to price predictability and forecasting errors.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号