Similar literature
1.
The goal of this paper is to develop the foundation for spatial navigation without objective representations. Rather than building the spatial representations on a Euclidean space, a weaker conception of space is used. A type of spatial representation is described that uses perceptual information directly to define the regions in space. By combining such regions, it is possible to derive a number of useful spatial representations such as place-fields, paths and topological maps. Compared to other methods, the representations of the present approach have the advantage that they are always grounded in the perceptual abilities of the robot.

2.
3.
Applied Soft Computing, 2007, 7(1): 353-363
For a supervised learning method, the quality of the training data or the training supervisor is very important in generating reliable neural networks. However, for real-world problems, it is not always easy to obtain high-quality training data sets. In this research, we propose a learning method for a neural network ensemble model that can be trained with an imperfect training data set, that is, a data set containing erroneous training samples. With a competitive training mechanism, the ensemble is able to exclude erroneous samples from the training process, thus generating a reliable neural network. Through experiments, we show that the proposed model is able to tolerate the existence of erroneous training samples in generating a reliable neural network. The ability of the neural network to tolerate erroneous samples in the training data lessens the costly task of analyzing and arranging the training data, thus increasing the usability of neural networks for real-world problems.

4.
Learning indistinguishability from data   Cited by: 1 (self-citations: 0, citations by others: 1)
In this paper we revisit the idea of interpreting fuzzy sets as representations of vague values. In this context a fuzzy set is induced by a crisp value and the membership degree of an element is understood as the similarity degree between this element and the crisp value that determines the fuzzy set. Similarity is assumed to be a notion of distance. This means that fuzzy sets are induced by crisp values and an appropriate distance function. This distance function can be described in terms of scaling the ordinary distance between real numbers. With this interpretation in mind, the task of designing a fuzzy system corresponds to determining suitable crisp values and appropriate scaling functions for the distance. When we want to generate a fuzzy model from data, the parameters have to be fitted to the data. This leads to an optimisation problem that is very similar to the optimisation task to be solved in objective function based clustering. We borrow ideas from the alternating optimisation schemes applied in fuzzy clustering in order to develop a new technique to determine our set of parameters from data, supporting the interpretability of the fuzzy system.
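
As a rough illustration of a fuzzy set induced by a crisp value and a scaled distance, the sketch below assumes the simplest case: a constant scaling factor and a membership degree of one minus the scaled distance, cut off at zero. The function name, the parameter names and the constant `scale` are assumptions made for this example; the abstract itself speaks of more general scaling functions and does not give a formula.

```python
def membership(x: float, crisp_value: float, scale: float) -> float:
    """Membership of x in the fuzzy set induced by `crisp_value`.

    Assumption for this sketch: the ordinary distance |x - crisp_value|
    is stretched by a constant factor `scale`, and similarity is
    1 - scaled distance, floored at 0.
    """
    scaled_distance = abs(scale * x - scale * crisp_value)
    return 1.0 - min(scaled_distance, 1.0)


# A larger scale makes the fuzzy set narrower around the crisp value.
for x in (2.0, 3.0, 10.0):
    print(x, membership(x, crisp_value=2.0, scale=0.5))
```

Under this reading, fitting a fuzzy model to data amounts to choosing the crisp values and the scaling, which is what makes the task resemble prototype-based clustering.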

5.
6.
7.
TAN learning algorithm based on incomplete data   Cited by: 1 (self-citations: 0, citations by others: 1)
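The TAN algorithm is an effective algorithm for complex data with very strong learning ability in practice, and it has been widely applied in data mining, machine learning and pattern recognition. Because most real-world data are incomplete, this paper studies how to make TAN learn effectively from incomplete data. First, conditional mutual information is estimated directly from incomplete data with an efficient method. Then the estimated conditional mutual information is used to extend the basic TAN algorithm so that it can handle incomplete data. Finally, the extended and the basic TAN algorithms are compared experimentally. The results show that on most incomplete data sets the extended TAN algorithm is markedly more accurate than the basic TAN algorithm. Although the time complexity of the extended TAN algorithm is higher than that of the basic one, it remains within an acceptable range. The method for estimating conditional mutual information can easily be combined with other techniques to further improve the performance of TAN.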
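
The abstract does not say how conditional mutual information is estimated from incomplete data. The sketch below shows one simple possibility, available-case analysis, in which each attribute pair uses only the records where both attributes and the class are observed. This is an assumption for illustration, not necessarily the paper's estimator; the function name and the use of `None` for missing values are also hypothetical.

```python
import math
from collections import Counter


def conditional_mutual_information(rows, i, j, c):
    """Estimate I(X_i; X_j | C) from records where all three values are observed.

    `rows` is a list of tuples; `None` marks a missing value (assumed encoding).
    """
    complete = [(r[i], r[j], r[c]) for r in rows
                if r[i] is not None and r[j] is not None and r[c] is not None]
    n = len(complete)
    if n == 0:
        return 0.0
    n_xyc = Counter(complete)
    n_xc = Counter((x, cl) for x, _, cl in complete)
    n_yc = Counter((y, cl) for _, y, cl in complete)
    n_c = Counter(cl for _, _, cl in complete)
    cmi = 0.0
    for (x, y, cl), count in n_xyc.items():
        # P(x, y, c) * log( P(x, y | c) / (P(x | c) * P(y | c)) )
        ratio = (count * n_c[cl]) / (n_xc[(x, cl)] * n_yc[(y, cl)])
        cmi += (count / n) * math.log(ratio)
    return cmi


# The last record has a missing first attribute and is skipped for this pair.
data = [(0, 0, "a"), (1, 1, "a"), (0, 0, "a"), (1, 1, "a"), (None, 1, "a")]
print(conditional_mutual_information(data, 0, 1, 2))
```

In TAN, such pairwise scores are used as edge weights for the maximum-weight spanning tree over the attributes, so a record with one missing attribute can still contribute to the pairs that do not involve it.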

8.
We present ELEM2, a machine learning system that induces classification rules from a set of data based on a heuristic search over a hypothesis space. ELEM2 is distinguished from other rule induction systems in three aspects. First, it uses a new heuristic function to guide the heuristic search. The function reflects the degree of relevance of an attribute-value pair to a target concept and leads to selection of the most relevant pairs for formulating rules. Second, ELEM2 handles inconsistent training examples by defining an unlearnable region of a concept based on the probability distribution of that concept in the training data. The unlearnable region is used as a stopping criterion for the concept learning process, which resolves conflicts without removing inconsistent examples. Third, ELEM2 employs a new rule quality measure in its post-pruning process to prevent rules from overfitting the data. The rule quality formula measures the extent to which a rule can discriminate between the positive and negative examples of a class. We describe features of ELEM2, its rule induction algorithm and its classification procedure. We report experimental results that compare ELEM2 with C4.5 and CN2 on a number of datasets.

9.
Previous partially supervised classification methods can partition unlabeled data into positive examples and negative examples for a given class by learning from positive labeled examples and unlabeled examples, but they cannot further group the negative examples into meaningful clusters even if there are many different classes in the negative examples. Here we propose an automatic method to obtain a natural partitioning of mixed data (labeled data + unlabeled data) by maximizing a stability criterion defined on classification results from an extended label propagation algorithm over all the possible values of model order (or the number of classes) in mixed data. Our experimental results on benchmark corpora for the word sense disambiguation task indicate that this model order identification algorithm with the extended label propagation algorithm as the base classifier outperforms SVM, a one-class partially supervised classification algorithm, and the model order identification algorithm with semi-supervised k-means clustering as the base classifier when labeled data is incomplete.

10.
Intelligent systems need to understand and respond to human words to enable them to interact with humans in a natural way. Several studies attempted to realize these abilities by investigating the symbol grounding problem. For example, we proposed multilayered multimodal latent Dirichlet allocation (mMLDA) to enable the formation of various concepts and inference using grounded concepts. We previously reported on the issue of connecting words to various hierarchical concepts and also proposed a simple preliminary algorithm for generating sentences. This paper proposes a novel method that enables a sensing system to verbalize an everyday scene it observes. The method uses mMLDA and Bayesian hidden Markov models (BHMM) and the proposed algorithm improves the word inference of our previous work. The advantage of our approach is that grammar learning based on BHMM not only boosts concept selection results but enables our method to process functional words. The proposed verbalization algorithm produces results that are far superior to those of previous methods. Finally, we developed a system to obtain multimodal data from human everyday activities. We evaluate language learning and sentence generation as a complete process under this realistic setting. The results demonstrate the effectiveness of our method.

11.
Annotating named entity recognition (NER) training corpora is a costly but necessary process for supervised NER approaches. This paper presents a general framework to generate large-scale NER training data from parallel corpora. In our method, we first employ a high-performance NER system on one side of a bilingual corpus. Then, we project the named entity (NE) labels to the other side according to the word-level alignments. Finally, we propose several strategies to select high-quality auto-labeled NER training data. We apply our approach to Chinese NER using an English-Chinese parallel corpus. Experimental results show that our approach can collect high-quality labeled data and can help improve Chinese NER.
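
The projection step described above can be sketched as follows. The sketch assumes plain entity-type labels with "O" for non-entities and alignments given as (source index, target index) pairs. The function name, the label set and the toy alignment are hypothetical, and the selection strategies mentioned in the abstract are not modelled.

```python
def project_labels(source_labels, alignments, target_length):
    """Copy entity labels from source tokens to their aligned target tokens.

    source_labels: one label per source token, "O" meaning "not an entity".
    alignments:    iterable of (source_index, target_index) word alignments.
    Unaligned target tokens keep the label "O".
    """
    target_labels = ["O"] * target_length
    for src_idx, tgt_idx in alignments:
        label = source_labels[src_idx]
        if label != "O":
            target_labels[tgt_idx] = label
    return target_labels


# Toy example: three source tokens, three target tokens, crossing alignment.
source = ["PER", "O", "LOC"]
links = [(0, 0), (1, 2), (2, 1)]
print(project_labels(source, links, target_length=3))
```

Because automatic alignments and the source-side tagger both make mistakes, projected sentences are noisy, which is why a filtering step for high-quality sentences follows the projection.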

12.
13.
Qualitative models describe relations between the observed quantities in qualitative terms. In predictive modelling, a qualitative model tells whether the output increases or decreases with the input. We describe Padé, a new method for qualitative learning which estimates partial derivatives of the target function from training data and uses them to induce qualitative models of the target function. We formulated three methods for computation of derivatives, all based on using linear regression on local neighbourhoods. The methods were empirically tested on artificial and real-world data. We also provide a case study which shows how the developed methods can be used in practice.
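
One way to read "linear regression on local neighbourhoods" is sketched below: fit an ordinary least-squares plane to the k nearest training points around a reference point and take its slopes as estimates of the partial derivatives. This is a generic illustration that assumes NumPy is available. It is not claimed to match any of the three methods in the paper, and the function names, `k` and `tolerance` are hypothetical.

```python
import numpy as np


def local_partial_derivatives(X, y, point, k=10):
    """Estimate the partial derivatives of the target at `point`.

    Fits y ~ w . (x - point) + b on the k nearest neighbours of `point`
    and returns w, one slope per input attribute.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    point = np.asarray(point, dtype=float)
    distances = np.linalg.norm(X - point, axis=1)
    nearest = np.argsort(distances)[:k]
    # Design matrix: centred attributes plus an intercept column.
    A = np.hstack([X[nearest] - point, np.ones((len(nearest), 1))])
    coeffs, *_ = np.linalg.lstsq(A, y[nearest], rcond=None)
    return coeffs[:-1]


def qualitative_signs(derivatives, tolerance=1e-6):
    """Map each derivative to '+', '-' or '0' (treated as no clear trend)."""
    return ["+" if d > tolerance else "-" if d < -tolerance else "0"
            for d in derivatives]


# Toy data on a grid: the target increases with the first input
# and decreases with the second.
grid = [(a, b) for a in range(5) for b in range(5)]
target = [2 * a - 3 * b for a, b in grid]
slopes = local_partial_derivatives(grid, target, point=(2, 2), k=9)
print(slopes, qualitative_signs(slopes))
```

The signs, computed at every training point, become labels from which a standard learner can induce a qualitative model.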

14.
Graphical models, especially probabilistic networks like Bayes networks and Markov networks, are a very popular way to make reasoning in high-dimensional domains feasible. Since constructing them manually can be tedious and time-consuming, a large part of recent research has been devoted to learning them from data. However, if the dataset to learn from contains imprecise information in the form of sets of alternatives instead of precise values, this learning task can pose unpleasant problems. In this paper, we survey an approach to cope with these problems which, unlike more common approaches such as expectation maximization, is not based on probability theory but uses possibility theory as the underlying calculus of a graphical model. We provide semantic foundations of possibilistic graphical models, explain the rationale of possibilistic decomposition as well as the graphical representation of decompositions of possibility distributions and finally discuss the main approaches to learn possibilistic graphical models from data.

15.
Many studies on streaming data classification have been based on a paradigm in which a fully labeled stream is available for learning purposes. However, it is often too labor-intensive and time-consuming to manually label a data stream for training. This difficulty may cause conventional supervised learning approaches to be infeasible in many real-world applications, such as credit fraud detection, intrusion detection, and rare event prediction. In previous work, Li et al. suggested that these applications be treated as a Positive and Unlabeled learning problem, and proposed a learning algorithm, OcVFDT, as a solution (Li et al. 2009). Their method requires only a set of positive examples and a set of unlabeled examples, which is easily obtainable in a streaming environment, making it widely applicable to real-life applications. Here, we enhance Li et al.’s solution by adding three features: an efficient method to estimate the percentage of positive examples in the training stream, the ability to handle numeric attributes, and the use of more appropriate classification methods at tree leaves. Experimental results on synthetic and real-life datasets show that our enhanced solution (called PUVFDT) has very good classification performance and a strong ability to learn from data streams with only positive and unlabeled examples. Furthermore, our enhanced solution reduces the learning time of OcVFDT by about an order of magnitude. Even with 80% of the examples in the training data stream unlabeled, PUVFDT can still achieve a competitive classification performance compared with that of VFDTcNB (Gama et al. 2003), a supervised learning algorithm.

16.
A wide range of combinatorial optimization algorithms have been developed for complex reasoning tasks. Frequently, no single algorithm outperforms all the others. This has raised interest in leveraging the performance of a collection of algorithms to improve performance. We show how to accomplish this using a Parallel Portfolio of Algorithms (PPA). A PPA is a collection of diverse algorithms for solving a single problem, all running concurrently on a single processor until a solution is produced. The performance of the portfolio may be controlled by assigning different shares of processor time to each algorithm. We present an effective method for finding a PPA in which the share of processor time allocated to each algorithm is fixed. Finding the optimal static schedule is shown to be an NP-complete problem for a general class of utility functions. We present bounds on the performance of the PPA over random instances and evaluate the performance empirically on a collection of 23 state-of-the-art SAT algorithms. The results show significant performance gains over the fastest individual algorithm in the collection.
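
To make the effect of fixed processor shares concrete, the sketch below uses an idealised time-sharing model: an algorithm that needs t seconds alone and receives a share s of the processor finishes after t / s seconds, and the portfolio stops as soon as any member finishes. Switching overhead is ignored. The function names and the runtime table are invented for illustration and are not taken from the paper.

```python
from statistics import mean


def portfolio_runtime(solo_runtimes, shares):
    """Wall-clock time until the first algorithm in the portfolio finishes.

    solo_runtimes: time each algorithm needs when it has the processor alone.
    shares:        fixed fraction of processor time per algorithm (sums to 1).
    """
    if len(solo_runtimes) != len(shares):
        raise ValueError("one share per algorithm is required")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return min(t / s for t, s in zip(solo_runtimes, shares) if s > 0)


def mean_portfolio_runtime(runtime_table, shares):
    """Average portfolio runtime over instances (rows of `runtime_table`)."""
    return mean(portfolio_runtime(row, shares) for row in runtime_table)


# Two instances, two algorithms with complementary strengths (made-up numbers).
table = [[1.0, 50.0],
         [100.0, 2.0]]
print(mean_portfolio_runtime(table, [0.5, 0.5]))   # equal shares
print(mean_portfolio_runtime(table, [1.0, 0.0]))   # first algorithm only
print(mean_portfolio_runtime(table, [0.0, 1.0]))   # second algorithm only
```

With complementary algorithms like these, an even split pays at most a factor of two on each instance but avoids the long runs of either algorithm alone, which is the intuition behind the reported gains over the fastest single algorithm.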

17.
18.
Learning model trees from evolving data streams   Cited by: 2 (self-citations: 0, citations by others: 2)
The problem of real-time extraction of meaningful patterns from time-changing data streams is of increasing importance for the machine learning and data mining communities. Regression in time-changing data streams is a relatively unexplored topic, despite the apparent applications. This paper proposes an efficient and incremental stream mining algorithm which is able to learn regression and model trees from possibly unbounded, high-speed and time-changing data streams. The algorithm is evaluated extensively in a variety of settings involving artificial and real data. To the best of our knowledge there is no other general-purpose algorithm for incrementally learning regression/model trees that is able to perform explicit change detection and informed adaptation. The algorithm performs online and in real time, observes each example only once at the speed of arrival, and maintains at any time a ready-to-use model tree. The tree leaves contain linear models induced online from the examples assigned to them, a process with low complexity. The algorithm has mechanisms for drift detection and model adaptation, which enable it to maintain accurate and updated regression models at any time. The drift detection mechanism exploits the structure of the tree in the process of local change detection. As a response to local drift, the algorithm is able to update the tree structure only locally. This approach improves the any-time performance and greatly reduces the costs of adaptation.

19.
Visual categorization problems, such as object classification or action recognition, are increasingly often approached using a detection strategy: a classifier function is first applied to candidate subwindows of the image or the video, and then the maximum classifier score is used for class decision. Traditionally, the subwindow classifiers are trained on a large collection of examples manually annotated with masks or bounding boxes. The reliance on time-consuming human labeling effectively limits the application of these methods to problems involving very few categories. Furthermore, the human selection of the masks introduces arbitrary biases (e.g., in terms of window size and location) which may be suboptimal for classification. We propose a novel method for learning a discriminative subwindow classifier from examples annotated with binary labels indicating the presence of an object or action of interest, but not its location. During training, our approach simultaneously localizes the instances of the positive class and learns a subwindow SVM to recognize them. We extend our method to classification of time series by presenting an algorithm that localizes the most discriminative set of temporal segments in the signal. We evaluate our approach on several datasets for object and action recognition and show that it achieves results similar and in many cases superior to those obtained with full supervision.

20.