首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
For a Markovian decision problem in which the transition probabilities are unknown, two learning algorithms are devised from the viewpoint of asymptotic optimality. Each time the algorithms select decisions to be used on the basis of not only the estimates of the unknown probabilities but also uncertainty of them. It is shown that the algorithms are asymptotically optimal in the sense that the probability of selecting an optimal policy converges to unity.  相似文献   

2.
Many types of nonlinear classifiers have been proposed to automatically generate land-cover maps from satellite images. Some are based on the estimation of posterior class probabilities, whereas others estimate the decision boundary directly. In this paper, we propose a modular design able to focus the learning process on the decision boundary by using posterior probability estimates. To do so, we use a self-configuring architecture that incorporates specialized modules to deal with conflicting classes, and we apply a learning algorithm that focuses learning on the posterior probability regions that are critical for the performance of the decision problem stated by the user-defined misclassification costs. Moreover, we show that by filtering the posterior probability map, the impulsive noise, which is a common effect in automatic land-cover classification, can be significantly reduced. Experimental results show the effectiveness of the proposed solutions on real multi- and hyperspectral images, versus other typical approaches, that are not based on probability estimates, such as Support Vector Machines.  相似文献   

3.
In the context of classification problems, the paper analyzes the general structure of the strict sense Bayesian (SSB) cost functions, those having a unique minimum when the soft decisions are equal to the posterior class probabilities. We show that any SSB cost is essentially the sum of a generalized measure of entropy, which does not depend on the targets, and an error component. Symmetric cost functions are analyzed in detail. Our results provide a further insight on the behavior of this family of objective functions and are the starting point for the exploration of novel algorithms. Two applications are proposed. First, the use of asymmetric SSB cost functions for posterior probability estimation in non-maximum a posteriori (MAP) decision problems. Second, a novel entropy minimization principle for hybrid learning: use labeled data to minimize the cost function, and unlabeled data to minimize the corresponding entropy measure.  相似文献   

4.
In many machine learning settings, labeled examples are difficult to collect while unlabeled data are abundant. Also, for some binary classification problems, positive examples which are elements of the target concept are available. Can these additional data be used to improve accuracy of supervised learning algorithms? We investigate in this paper the design of learning algorithms from positive and unlabeled data only. Many machine learning and data mining algorithms, such as decision tree induction algorithms and naive Bayes algorithms, use examples only to evaluate statistical queries (SQ-like algorithms). Kearns designed the statistical query learning model in order to describe these algorithms. Here, we design an algorithm scheme which transforms any SQ-like algorithm into an algorithm based on positive statistical queries (estimate for probabilities over the set of positive instances) and instance statistical queries (estimate for probabilities over the instance space). We prove that any class learnable in the statistical query learning model is learnable from positive statistical queries and instance statistical queries only if a lower bound on the weight of any target concept f can be estimated in polynomial time. Then, we design a decision tree induction algorithm POSC4.5, based on C4.5, that uses only positive and unlabeled examples and we give experimental results for this algorithm. In the case of imbalanced classes in the sense that one of the two classes (say the positive class) is heavily underrepresented compared to the other class, the learning problem remains open. This problem is challenging because it is encountered in many real-world applications.  相似文献   

5.
Active Sampling for Class Probability Estimation and Ranking   总被引:1,自引:0,他引:1  
In many cost-sensitive environments class probability estimates are used by decision makers to evaluate the expected utility from a set of alternatives. Supervised learning can be used to build class probability estimates; however, it often is very costly to obtain training data with class labels. Active learning acquires data incrementally, at each phase identifying especially useful additional data for labeling, and can be used to economize on examples needed for learning. We outline the critical features of an active learner and present a sampling-based active learning method for estimating class probabilities and class-based rankings. BOOTSTRAP-LV identifies particularly informative new data for learning based on the variance in probability estimates, and uses weighted sampling to account for a potential example's informative value for the rest of the input space. We show empirically that the method reduces the number of data items that must be obtained and labeled, across a wide variety of domains. We investigate the contribution of the components of the algorithm and show that each provides valuable information to help identify informative examples. We also compare BOOTSTRAP-LV with UNCERTAINTY SAMPLING, an existing active learning method designed to maximize classification accuracy. The results show that BOOTSTRAP-LV uses fewer examples to exhibit a certain estimation accuracy and provide insights to the behavior of the algorithms. Finally, we experiment with another new active sampling algorithm drawing from both UNCERTAINTY SAMPLING and BOOTSTRAP-LV and show that it is significantly more competitive with BOOTSTRAP-LV compared to UNCERTAINTY SAMPLING. The analysis suggests more general implications for improving existing active sampling algorithms for classification.  相似文献   

6.
Reinforcement learning (RL) is concerned with the identification of optimal controls in Markov decision processes (MDPs) where no explicit model of the transition probabilities is available. We propose a class of RL algorithms which always produces stable estimates of the value function. In detail, we use "local averaging" methods to construct an approximate dynamic programming (ADP) algorithm. Nearest-neighbor regression, grid-based approximations, and trees can all be used as the basis of this approximation. We provide a thorough theoretical analysis of this approach and we demonstrate that ADP converges to a unique approximation in continuous-state average-cost MDPs. In addition, we prove that our method is consistent in the sense that an optimal approximate strategy is identified asymptotically. With regard to a practical implementation, we suggest a reduction of ADP to standard dynamic programming in an artificial finite-state MDP.  相似文献   

7.
We consider the problem of exploiting a taxonomy of propositionalized attributes in order to learn compact and robust classifiers. We introduce propositionalized attribute taxonomy guided decision tree learner (PAT-DTL), an inductive learning algorithm that exploits a taxonomy of propositionalized attributes as prior knowledge to generate compact decision trees. Since taxonomies are unavailable in most domains, we also introduce propositionalized attribute taxonomy learner (PAT-Learner) that automatically constructs taxonomy from data. PAT-DTL uses top-down and bottom-up search to find a locally optimal cut that corresponds to the literals of decision rules from data and propositionalized attribute taxonomy. PAT-Learner propositionalizes attributes and hierarchically clusters the propositionalized attributes based on the distribution of class labels that co-occur with them to generate a taxonomy. Our experimental results on UCI repository data sets show that the proposed algorithms can generate a decision tree that is generally more compact than and is sometimes comparably accurate to those produced by standard decision tree learners.  相似文献   

8.
Power companies can benefit from the use of knowledge discovery methods and statistical machine learning for preventive maintenance. We introduce a general process for transforming historical electrical grid data into models that aim to predict the risk of failures for components and systems. These models can be used directly by power companies to assist with prioritization of maintenance and repair work. Specialized versions of this process are used to produce 1) feeder failure rankings, 2) cable, joint, terminator, and transformer rankings, 3) feeder Mean Time Between Failure (MTBF) estimates, and 4) manhole events vulnerability rankings. The process in its most general form can handle diverse, noisy, sources that are historical (static), semi-real-time, or realtime, incorporates state-of-the-art machine learning algorithms for prioritization (supervised ranking or MTBF), and includes an evaluation of results via cross-validation and blind test. Above and beyond the ranked lists and MTBF estimates are business management interfaces that allow the prediction capability to be integrated directly into corporate planning and decision support; such interfaces rely on several important properties of our general modeling approach: that machine learning features are meaningful to domain experts, that the processing of data is transparent, and that prediction results are accurate enough to support sound decision making. We discuss the challenges in working with historical electrical grid data that were not designed for predictive purposes. The “rawness” of these data contrasts with the accuracy of the statistical models that can be obtained from the process; these models are sufficiently accurate to assist in maintaining New York City’s electrical grid.  相似文献   

9.
Given free information and unlimited processing power, should decision algorithms use as much information as possible? A formal model of the decision-making environment is developed to address this question and provide conditions under which informationally frugal algorithms, without any information or processing costs whatsoever, are optimal. One cause of compression that allows optimal algorithms to rationally ignore information is inverse movement of payoffs and probabilities (e.g., high payoffs occur with low probably and low payoffs occur with high probability). If inversely related payoffs and probabilities cancel out, then predictors that correlate with payoffs and consequently condition the probabilities associated with different payoffs will drop out of the expected-payoff objective function, severing the link between information and optimal action rules. Stochastic payoff processes in which rational ignoring occurs are referred to as compressed environments, because optimal action depends on a reduced-dimension subset of the environmental parameters. This paper considers benefits and limitations of economic models versus other methods for studying links between environmental structure and the real-world success of simple decision procedures. Different methods converge on the normative proposition of ecological rationality, as opposed to axiomatic rationality based on informational efficiency and internal consistency axioms, as a superior framework for comparing the effectiveness of decision strategies and prescribing decision algorithms in application.  相似文献   

10.
There is no method to determine the optimal topology for multi-layer neural networks for a given problem. Usually the designer selects a topology for the network and then trains it. Since determination of the optimal topology of neural networks belongs to class of NP-hard problems, most of the existing algorithms for determination of the topology are approximate. These algorithms could be classified into four main groups: pruning algorithms, constructive algorithms, hybrid algorithms and evolutionary algorithms. These algorithms can produce near optimal solutions. Most of these algorithms use hill-climbing method and may be stuck at local minima. In this article, we first introduce a learning automaton and study its behaviour and then present an algorithm based on the proposed learning automaton, called survival algorithm, for determination of the number of hidden units of three layers neural networks. The survival algorithm uses learning automata as a global search method to increase the probability of obtaining the optimal topology. The algorithm considers the problem of optimization of the topology of neural networks as object partitioning rather than searching or parameter optimization as in existing algorithms. In survival algorithm, the training begins with a large network, and then by adding and deleting hidden units, a near optimal topology will be obtained. The algorithm has been tested on a number of problems and shown through simulations that networks generated are near optimal.  相似文献   

11.
When modeling a decision problem using the influence diagram framework, the quantitative part rests on two principal components: probabilities for representing the decision maker's uncertainty about the domain and utilities for representing preferences. Over the last decade, several methods have been developed for learning the probabilities from a database. However, methods for learning the utilities have only received limited attention in the computer science community.

A promising approach for learning a decision maker's utility function is to take outset in the decision maker's observed behavioral patterns, and then find a utility function which (together with a domain model) can explain this behavior. That is, it is assumed that decision maker's preferences are reflected in the behavior. Standard learning algorithms also assume that the decision maker is behavioral consistent, i.e., given a model of the decision problem, there exists a utility function which can account for all the observed behavior. Unfortunately, this assumption is rarely valid in real-world decision problems, and in these situations existing learning methods may only identify a trivial utility function. In this paper we relax this consistency assumption, and propose two algorithms for learning a decision maker's utility function from possibly inconsistent behavior; inconsistent behavior is interpreted as random deviations from an underlying (true) utility function. The main difference between the two algorithms is that the first facilitates a form of batch learning whereas the second focuses on adaptation and is particularly well-suited for scenarios where the DM's preferences change over time. Empirical results demonstrate the tractability of the algorithms, and they also show that the algorithms converge toward the true utility function for even very small sets of observations.  相似文献   


12.
Incomplete knowledge refers to situations in which a decision maker can rank the probabilities of occurrence for events of interest, but cannot specify these probabilities exactly. In this paper a Monte Carlo approach is used to investigate how physicians can arrive at revised or posterior rankings of disease probability given only rank order information about both the patient's prior probabilities of disease and the conditional probabilities of specific clinical findings for each disease. Computer-generated estimates of the expected frequencies of the posterior rankings are presented for the cases of 2, 3, and 4 disease states. The application of probability revision under conditions of incomplete knowledge to therapeutic decision making is discussed.  相似文献   

13.
神经网络增强学习的梯度算法研究   总被引:11,自引:1,他引:11  
徐昕  贺汉根 《计算机学报》2003,26(2):227-233
针对具有连续状态和离散行为空间的Markov决策问题,提出了一种新的采用多层前馈神经网络进行值函数逼近的梯度下降增强学习算法,该算法采用了近似贪心且连续可微的Boltzmann分布行为选择策略,通过极小化具有非平稳行为策略的Bellman残差平方和性能指标,以实现对Markov决策过程最优值函数的逼近,对算法的收敛性和近似最优策略的性能进行了理论分析,通过Mountain-Car学习控制问题的仿真研究进一步验证了算法的学习效率和泛化性能。  相似文献   

14.
This paper presents a hybrid model named: CLA-DE for global numerical optimization. This model is based on cellular learning automata (CLA) and differential evolution algorithm. The main idea is to learn the most promising regions of the search space using cellular learning automata. Learning automata in the CLA iteratively partition the search dimensions of a problem and learn the most admissible partitions. In order to facilitate incorporation among the CLA cells and improve their impact on each other, differential evolution algorithm is incorporated, by which communication and information exchange among neighboring cells are speeded up. The proposed model is compared with some evolutionary algorithms to demonstrate its effectiveness. Experiments are conducted on a group of benchmark functions which are commonly used in the literature. The results show that the proposed algorithm can achieve near optimal solutions in all cases which are highly competitive with the ones from the compared algorithms.  相似文献   

15.
Heger  Matthias 《Machine Learning》1996,22(1-3):197-225
Many reinforcement learning (RL) algorithms approximate an optimal value function. Once the function is known, it is easy to determine an optimal policy. For most real-world applications, however, the value function is too complex to be represented by lookup tables, making it necessary to use function approximators such as neural networks. In this case, convergence to the optimal value function is no longer guaranteed and it becomes important to know to which extent performance diminishes when one uses approximate value functions instead of optimal ones. This problem has recently been discussed in the context of expectation based Markov decision problems. Our analysis generalizes this work to minimax-based Markov decision problems, yields new results for expectation-based tasks, and shows how minimax-based and expectation based Markov decision problems relate.  相似文献   

16.
在光照和目标形变等外部条件变化的情况下,仅利用目标的单一特征难以鲁棒的跟踪目标。提出了一种基于粒子滤波后验概率分布的多特征融合跟踪算法,在粒子滤波跟踪框架下,用直方图模型表征目标的颜色和边缘特征,通过两种特征后验概率之间的"协作"与"学习"实现特征融合,各种场景的试验结果比较表明,新的融合跟踪算法比仅用单一特征跟踪、现有的多特征融合算法具有更好的稳定性和鲁棒性,特别是针对环境光照和目标背景变化较大的情况更具有优势。  相似文献   

17.
One of the difficulties in generating an optimal policy for systems planning and control by the Markov decision process is that the state transition probabilities must be known a priori. A usual approach to estimate the state transition probabilities is by using historical data. However, if the process is not completely stationary, it may be more convenient to obtain estimates of the transition probabilities by using another approach, namely, parameter adaptation by neural networks.

A significant advantage of neural network modeling of the Markovian decision problem is that the temporal nonstationary state transition probabilities can be revised by a parameter learning paradigm. The objective of this paper is to present this approach and demonstrate its applicability by modeling a finite-stage decision problem.  相似文献   


18.
Jiang W 《Neural computation》2006,18(11):2762-2776
Modern data mining and bioinformatics have presented an important playground for statistical learning techniques, where the number of input variables is possibly much larger than the sample size of the training data. In supervised learning, logistic regression or probit regression can be used to model a binary output and form perceptron classification rules based on Bayesian inference. We use a prior to select a limited number of candidate variables to enter the model, applying a popular method with selection indicators. We show that this approach can induce posterior estimates of the regression functions that are consistently estimating the truth, if the true regression model is sparse in the sense that the aggregated size of the regression coefficients are bounded. The estimated regression functions therefore can also produce consistent classifiers that are asymptotically optimal for predicting future binary outputs. These provide theoretical justifications for some recent empirical successes in microarray data analysis.  相似文献   

19.
This paper discusses the state estimation and optimal control problem of a class of partially‐observable stochastic hybrid systems (POSHS). The POSHS has interacting continuous and discrete dynamics with uncertainties. The continuous dynamics are given by a Markov‐jump linear system and the discrete dynamics are defined by a Markov chain whose transition probabilities are dependent on the continuous state via guard conditions. The only information available to the controller are noisy measurements of the continuous state. To solve the optimal control problem, a separable control scheme is applied: the controller estimates the continuous and discrete states of the POSHS using noisy measurements and computes the optimal control input from the state estimates. Since computing both optimal state estimates and optimal control inputs are intractable, this paper proposes computationally efficient algorithms to solve this problem numerically. The proposed hybrid estimation algorithm is able to handle state‐dependent Markov transitions and compute Gaussian‐ mixture distributions as the state estimates. With the computed state estimates, a reinforcement learning algorithm defined on a function space is proposed. This approach is based on Monte Carlo sampling and integration on a function space containing all the probability distributions of the hybrid state estimates. Finally, the proposed algorithm is tested via numerical simulations.  相似文献   

20.
We present a statistical approach to developing multimodal recognition systems and, in particular, to integrating the posterior probabilities of parallel input signals involved in the multimodal system. We first identify the primary factors that influence multimodal recognition performance by evaluating the multimodal recognition probabilities. We then develop two techniques, an estimate approach and a learning approach, which are designed to optimize accurate recognition during the multimodal integration process. We evaluate these methods using Quickset, a speech/gesture multimodal system, and report evaluation results based on an empirical corpus collected with Quickset. From an architectural perspective, the integration technique presented offers enhanced robustness. It also is premised on more realistic assumptions than previous multimodal systems using semantic fusion. From a methodological standpoint, the evaluation techniques that we describe provide a valuable tool for evaluating multimodal systems  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号