Similar literature
1.
We describe a new incremental algorithm for training linear threshold functions: the Relaxed Online Maximum Margin Algorithm, or ROMMA. ROMMA can be viewed as an approximation to the algorithm that repeatedly chooses the hyperplane that classifies previously seen examples correctly with the maximum margin. It is known that such a maximum-margin hypothesis can be computed by minimizing the length of the weight vector subject to a number of linear constraints. ROMMA works by maintaining a relatively simple relaxation of these constraints that can be efficiently updated. We prove a mistake bound for ROMMA that is the same as that proved for the perceptron algorithm. Our analysis implies that the maximum-margin algorithm also satisfies this mistake bound; this is the first worst-case performance guarantee for this algorithm. We describe some experiments using ROMMA, and a variant that updates its hypothesis more aggressively, as batch algorithms to recognize handwritten digits. The computational complexity and simplicity of these algorithms are similar to those of the perceptron algorithm, but their generalization is much better. We show that a batch algorithm based on aggressive ROMMA converges to the fixed threshold SVM hypothesis.
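The abstract does not spell out the update rule, so the following is only a minimal sketch under stated assumptions. It assumes that, on a mistake, the relaxed problem is to find the shortest weight vector satisfying two constraints with equality: `w_new . w_old = ||w_old||^2` (a single constraint summarizing the past) and `y * (w_new . x) = 1` (the new example). The function names `romma_update` and `train` are hypothetical, and the closed form below is obtained by solving that 2x2 linear system, not quoted from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def romma_update(w, x, y):
    """One mistake-driven update (sketch); y must be +1 or -1.

    Assumption: the new vector has the form c*w + d*x and satisfies
    w_new . w = ||w||^2 and y * (w_new . x) = 1.
    """
    ww, xx, wx = dot(w, w), dot(x, x), dot(w, x)
    if xx == 0.0:
        return list(w)  # a zero example carries no information
    if ww == 0.0:
        # No hypothesis yet: shortest w with y * (w . x) = 1.
        return [y * xi / xx for xi in x]
    det = ww * xx - wx * wx
    if det == 0.0:
        return list(w)  # x parallel to w: system is singular, skip
    c = (ww * xx - y * wx) / det
    d = ww * (y - wx) / det
    return [c * wi + d * xi for wi, xi in zip(w, x)]


def train(examples, epochs=1):
    """Run the sketch over (x, y) pairs, updating only on mistakes."""
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for x, y in examples:
            if y * dot(w, x) <= 0:  # misclassified (or no hypothesis yet)
                w = romma_update(w, x, y)
    return w
```

Like the perceptron, this touches the weight vector only on mistakes and needs a handful of inner products per update, which is consistent with the abstract's remark about comparable computational cost. The more aggressive variant mentioned in the abstract is not sketched here.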

2.
Traditional nonlinear manifold learning methods have achieved great success in dimensionality reduction and feature extraction, but most of them operate in batch mode. When new samples are observed, batch methods must be recomputed from scratch, which is computationally intensive, especially when the number or dimension of the input samples is large. This paper presents incremental learning algorithms for Laplacian eigenmaps, which compute a low-dimensional representation of a data set by optimally preserving local neighborhood information in a certain sense. A sub-manifold analysis algorithm, together with an alternative formulation of the linear incremental method, is proposed to learn new samples incrementally. A locally linear reconstruction mechanism is introduced to update the embedding results of the existing samples. The algorithms are easy to implement and the computation procedure is simple. Simulation results demonstrate the efficiency and accuracy of the proposed algorithms.

3.
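Objective: In the multi-label supervised learning framework, building a classifier with strong generalization performance requires a large number of labeled training samples, yet in practice labeled samples are scarce and very expensive to obtain. To address the shortage of labeled samples and the low efficiency of classifier re-learning in multi-label image classification, an online multi-label image classification algorithm combined with active learning is proposed. Method: Based on min-max theory, a selection strategy that queries the most representative and most informative samples is used to actively choose the samples to be labeled, and the multi-label image classifier is updated online based on the KKT (Karush-Kuhn-Tucker) conditions. Results: The proposed algorithm is evaluated on four public datasets using four multi-label classification metrics. The experiments show that the proposed sample selection method has a clear advantage over both random sample selection and margin-based sampling; when the classifiers reach the same or similar classification accuracy, the proposed strategy selects markedly fewer samples for labeling than random selection and margin-based sampling need to query. Conclusion: On the one hand, the proposed algorithm reduces the manual labeling cost of obtaining labeled samples; on the other hand, it avoids the low learning efficiency of conventional classifier retraining, which uses all the data, and so achieves real-time classifier updates when new data arrive.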

4.
In this paper, we introduce a new algorithm for incremental learning of a specific form of Takagi–Sugeno fuzzy systems proposed by Wang and Mendel in 1992. The new data-driven online learning approach includes not only the adaptation of linear parameters appearing in the rule consequents, but also the incremental learning of premise parameters appearing in the membership functions (fuzzy sets), together with a rule learning strategy in sample mode. A modified version of vector quantization is exploited for rule evolution and for incremental learning of the rules' premise parts. The modifications include an automatic generation of new clusters based on the nature, distribution, and quality of new data and an alternative strategy for selecting the winning cluster (rule) in each incremental learning step. Antecedent and consequent learning are connected in a stable manner, meaning that convergence toward the optimal parameter set in the least-squares sense can be achieved. An evaluation and a comparison to conventional batch methods based on static and dynamic process models are presented for high-dimensional data recorded at engine test benches and at rolling mills. For the latter, the obtained data-driven fuzzy models are also compared with an analytical physical model. Furthermore, a comparison with other evolving fuzzy systems approaches is carried out based on nonlinear dynamic system identification tasks and a three-input nonlinear function approximation example.

5.
Cluster analysis is used to explore structure in unlabeled batch data sets in a wide range of applications. An important part of cluster analysis is validating the quality of computationally obtained clusters. A large number of different internal indices have been developed for validation in the offline setting. However, this concept cannot be directly extended to the online setting because streaming algorithms neither retain the data nor maintain a partition of it, both of which are needed by batch cluster validity indices. In this paper, we develop two incremental versions (with and without forgetting factors) of the Xie-Beni and Davies-Bouldin validity indices, and use them to monitor and control two streaming clustering algorithms (sk-means and online ellipsoidal clustering). In this context, our new incremental validity indices are more accurately viewed as performance monitoring functions. We also show that incremental cluster validity indices can send a distress signal to online monitors when evolving structure leads an algorithm astray. Our numerical examples indicate that the incremental Xie-Beni index with a forgetting factor is superior to the other three indices tested.
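To make concrete why a batch validity index needs both the stored data and a partition of it, here is a minimal sketch of the standard batch Davies-Bouldin index with Euclidean distance. It is not the incremental version developed in the paper, whose update formulas the abstract does not give; the function names `centroid` and `davies_bouldin` are chosen for illustration.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def davies_bouldin(clusters):
    """Batch Davies-Bouldin index for a partition given as a list of clusters.

    Each cluster is a non-empty list of points; at least two clusters with
    distinct centroids are assumed. Lower values indicate more compact,
    better separated clusters.
    """
    cents = [centroid(c) for c in clusters]
    # Scatter: mean distance from each point to its cluster centroid.
    scatter = [
        sum(math.dist(p, m) for p in c) / len(c)
        for c, m in zip(clusters, cents)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
    return total / k
```

Every term here is computed from the full point lists, which is exactly what a streaming algorithm does not keep; an incremental index has to replace these sums with running statistics.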

6.
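To address the poor real-time performance and insufficient adaptability to dynamic environments of current indoor fingerprint positioning algorithms, a new positioning algorithm based on a semi-supervised extreme learning machine is proposed. The algorithm first builds an initial location estimation model with a semi-supervised extreme learning machine, then dynamically adjusts the original positioning model using newly added semi-labeled data, and finally assigns suitable penalty weights to the newly added training data so that the model has a timeliness mechanism. Simulation results show that the positioning algorithm improves adaptability to dynamic environments while preserving real-time positioning.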

7.
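A New Incremental Decision Tree Algorithm   Total citations: 1, self-citations: 0, citations by others: 1
For online classification systems with rapidly growing data, such as customer behavior analysis, Web log analysis, and network intrusion detection, adapting quickly to newly added samples is key to ensuring correct classification and sustainable operation. This paper proposes a new decision tree algorithm that adapts to data increments; combined with the Bayesian method, it uses the newly added samples to quickly train a new decision tree on the basis of the existing one. Experimental results show that the proposed algorithm handles this problem well: compared with rebuilding the decision tree, it costs less time and achieves higher classification accuracy, making it better suited to online classification systems.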

8.
Lazy Learning of Bayesian Rules   Total citations: 19, self-citations: 0, citations by others: 19
The naive Bayesian classifier provides a simple and effective approach to classifier learning, but its attribute independence assumption is often violated in the real world. A number of approaches have sought to alleviate this problem. A Bayesian tree learning algorithm builds a decision tree, and generates a local naive Bayesian classifier at each leaf. The tests leading to a leaf can alleviate attribute inter-dependencies for the local naive Bayesian classifier. However, Bayesian tree learning still suffers from the small disjunct problem of tree learning. While inferred Bayesian trees demonstrate low average prediction error rates, there is reason to believe that error rates will be higher for those leaves with few training examples. This paper proposes the application of lazy learning techniques to Bayesian tree induction and presents the resulting lazy Bayesian rule learning algorithm, called LBR. This algorithm can be justified by a variant of Bayes' theorem which supports a weaker conditional attribute independence assumption than is required by naive Bayes. For each test example, it builds a most appropriate rule with a local naive Bayesian classifier as its consequent. It is demonstrated that the computational requirements of LBR are reasonable in a wide cross-section of natural domains. Experiments with these domains show that, on average, this new algorithm obtains lower error rates significantly more often than the reverse in comparison to a naive Bayesian classifier, C4.5, a Bayesian tree learning algorithm, a constructive Bayesian classifier that eliminates attributes and constructs new attributes using Cartesian products of existing nominal attributes, and a lazy decision tree learning algorithm. It also outperforms, although the result is not statistically significant, a selective naive Bayesian classifier.
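As a point of reference for the attribute independence assumption discussed above, here is a minimal sketch of a plain naive Bayesian classifier over nominal attributes. It is the baseline that LBR builds on, not LBR itself; the function names `fit` and `predict`, and the use of Laplace (add-one) smoothing, are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def fit(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values = defaultdict(set)            # attribute index -> distinct values seen
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, c)][v] += 1
            values[i].add(v)
    return class_counts, value_counts, values


def predict(model, row):
    """Return the class maximizing log P(c) + sum_i log P(v_i | c).

    The sum over attributes is where the independence assumption enters:
    each attribute contributes as if it were independent of the others
    given the class.
    """
    class_counts, value_counts, values = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, nc in class_counts.items():
        score = math.log(nc / n)
        for i, v in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            score += math.log((value_counts[(i, c)][v] + 1) / (nc + len(values[i])))
        if score > best_score:
            best, best_score = c, score
    return best
```

A usage example with made-up toy data: after `model = fit([("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")], ["no", "no", "yes", "yes"])`, calling `predict(model, ("sunny", "hot"))` picks the class whose smoothed per-attribute frequencies best match the row.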

9.
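Taking the rule base as the starting point, a batch incremental update algorithm for decision rules is proposed. An equivalence class table is built for all newly added objects, the original rule base is matched efficiently against the equivalence class table, and rules are updated according to the match type of each new object. The algorithm applies to both complete and incomplete data, and needs only two passes over the rule base to complete the update. Both theoretical analysis and comparative experiments on UCI data show that the method outperforms traditional methods.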

10.
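Liu Xiaoping, Computer Simulation (《计算机仿真》), 2005, 22(12): 76-79
Most data mining tools for knowledge discovery use rule discovery and decision tree classification techniques to find patterns and rules in data. By adopting a discretization method based on simulation attributes, a probability-and-statistics-based method for handling unknown attributes and noisy data, and an error-based pruning algorithm, this paper implements a general algorithm template for automatically generating decision trees. With this template, designers of decision tree algorithms can quickly validate new algorithms designed for specific decision problems. The basic mechanism for constructing a decision tree is that the algorithm designer initializes the general algorithm template with formulas of their own definition, and then uses the interactive graphical environment provided by the system to test the algorithm on different decision problems, thereby finding the algorithm suited to a particular problem.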
