Similar literature
 Found 20 similar documents (search time: 281 ms)
1.
2.
3.
While many constructive induction algorithms focus on generating new binary attributes, this paper explores novel methods of constructing nominal and numeric attributes. We propose a new constructive operator, X-of-N. An X-of-N representation is a set containing one or more attribute-value pairs. For a given instance, the value of an X-of-N representation corresponds to the number of its attribute-value pairs that are true of the instance. A single X-of-N representation can directly and simply represent any concept that can be represented by a single conjunctive, a single disjunctive, or a single M-of-N representation commonly used for constructive induction, but the reverse is not true. In this paper, we describe a constructive decision tree learning algorithm, called XofN. When building decision trees, this algorithm creates one X-of-N representation, either as a nominal attribute or as a numeric attribute, at each decision node. The construction of X-of-N representations is carried out by greedily searching the space defined by all the attribute-value pairs of a domain. Experimental results reveal that constructing X-of-N attributes can significantly improve the performance of decision tree learning in both artificial and natural domains in terms of higher prediction accuracy and lower theory complexity. The results also show the performance advantages of constructing X-of-N attributes over constructing conjunctive, disjunctive, or M-of-N representations for decision tree learning.
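The X-of-N value described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `x_of_n_value`, the dictionary encoding of an instance, and the example attributes are hypothetical and do not come from the paper.

```python
def x_of_n_value(pairs, instance):
    """Count how many (attribute, value) pairs are true of the instance."""
    return sum(1 for attr, val in pairs if instance.get(attr) == val)


# Hypothetical X-of-N representation and instance, for illustration only.
pairs = {("color", "red"), ("shape", "round"), ("size", "large")}
instance = {"color": "red", "shape": "square", "size": "large"}

# Two of the three attribute-value pairs hold for this instance.
value = x_of_n_value(pairs, instance)
```

Under this encoding, a conjunction of the N pairs corresponds to the test `value == N`, a disjunction to `value >= 1`, and an M-of-N concept to `value >= M`, which is one way to read the abstract's claim that a single X-of-N representation covers all three.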

4.
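An unsupervised discretization algorithm based on a mixture probabilistic model   Total citations: 10, self-citations: 0, citations by others: 10
Li Gang, Chinese Journal of Computers (计算机学报), 2002, 25(2): 158-164
Real-world applications often involve many continuous numeric attributes, yet many current machine learning algorithms require the attributes they process to take discrete values. Depending on whether the values of the associated class attribute are considered when discretizing numeric attributes, discretization algorithms fall into two categories: supervised and unsupervised. Based on a mixture probabilistic model, this paper proposes a theoretically rigorous unsupervised discretization algorithm that, with no prior knowledge and no class attribute, partitions the value range of a numeric attribute into a number of subintervals and then uses the Bayesian information criterion to automatically find the best number of subintervals and the best partition.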
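The model-selection step mentioned above can be illustrated with the standard Bayesian information criterion. The sketch below is not the paper's algorithm: it assumes a one-dimensional Gaussian mixture, assumes the maximized log-likelihood for each candidate number of components has already been obtained (for example by EM), and uses made-up log-likelihood values purely for illustration. The names `gmm_bic` and `best_component_count` are hypothetical.

```python
import math


def gmm_bic(log_likelihood, n_components, n_samples):
    """BIC = -2 ln L + k ln n for a 1-D Gaussian mixture (lower is better)."""
    # Free parameters: K means + K variances + (K - 1) mixing weights.
    n_params = 3 * n_components - 1
    return -2.0 * log_likelihood + n_params * math.log(n_samples)


def best_component_count(log_likelihoods, n_samples):
    """Pick the component count K whose fitted model has the lowest BIC."""
    return min(
        log_likelihoods,
        key=lambda k: gmm_bic(log_likelihoods[k], k, n_samples),
    )


# Made-up maximized log-likelihoods for K = 1, 2, 3 on 200 samples.
hypothetical_loglik = {1: -520.0, 2: -450.0, 3: -448.0}
best = best_component_count(hypothetical_loglik, n_samples=200)
```

With these illustrative numbers the third component improves the fit only slightly, so the penalty term outweighs the gain and two components are selected. How the selected mixture is turned into interval boundaries is specific to the paper and is not shown here.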

5.
A framework for automatic landmark identification is presented based on an algorithm for corresponding the boundaries of two shapes. The auto-landmarking framework employs a binary tree of corresponded pairs of shapes to generate landmarks automatically on each of a set of example shapes. The landmarks are used to train statistical shape models, known as point distribution models. The correspondence algorithm locates a matching pair of sparse polygonal approximations, one for each of a pair of boundaries, by minimizing a cost function using a greedy algorithm. The cost function expresses the dissimilarity in both the shape and representation error (with respect to the defining boundary) of the sparse polygons. Results are presented for three classes of shape which exhibit various types of nonrigid deformation.

6.
7.
We examine the class of multi-linear representations (MLR) for expressing probability distributions over discrete variables. Recently, MLR have been considered as intermediate representations that facilitate inference in distributions represented as graphical models. We show that MLR is an expressive representation of discrete distributions and can be used to concisely represent classes of distributions which have exponential size in other commonly used representations, while supporting probabilistic inference in time linear in the size of the representation. Our key contribution is presenting techniques for learning bounded-size distributions represented using MLR, which support efficient probabilistic inference. We demonstrate experimentally that the MLR representations we learn support accurate and very efficient inference.

8.
Pattern Recognition, 2004, 37(1): 47-59
A new general image segmentation system is presented, based on the calculation of a tree representation of the original image in which image regions are assigned to tree nodes, followed by a correspondence process with a model tree, which embeds the a priori knowledge about the images. For this correspondence, an original algorithm is proposed, which performs the minimization of an error function that quantifies the difference between the input image tree and the model tree. We also present a new algorithm for automatically calculating the model tree from a set of manually segmented images. Results on synthetic and MR brain images are presented.

9.
We present in this article the model function-described graph (FDG), which is a type of compact representation of a set of attributed graphs (AGs) that borrows from random graphs the capability of probabilistic modelling of structural and attribute information. We define the FDGs, their features and two distance measures between AGs (unclassified patterns) and FDGs (models or classes) and we also explain an efficient matching algorithm. Two applications of FDGs are presented: in the former, FDGs are used for modelling and matching 3D-objects described by multiple views, whereas in the latter, they are used for representing and recognising human faces, described also by several views.

10.
Boolean Feature Discovery in Empirical Learning   Total citations: 19, self-citations: 7, citations by others: 12

11.
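The concept of a half-edge is introduced to describe the characteristic attributes of Internet resources, establishing a unified descriptive framework for the characteristic attributes of all kinds of resources in a network environment. The conventional resource relationship graph is extended, and a time-varying half-edge graph model of resource attribute relationships is proposed. Taking the scale-free property of networks as the theoretical basis for how resource association relationships evolve, a concrete algorithm for generating time-varying half-edge graphs is given. Time-varying half-edge graphs reflect the dynamic topological relationships among resource attributes more conveniently, are highly extensible, and are expected to reproduce the scale-free property of real networks.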

12.
Different ways of representing probabilistic relationships among the attributes of a domain are examined, and it is shown that the nature of domain relationships used in a representation affects the types of reasoning objectives that can be achieved. Two well-known formalisms for representing the probabilistic relationships among attributes of a domain are considered: the dependence tree formalism presented by C.K. Chow and C.N. Liu (1968) and the Bayesian networks methodology presented by J. Pearl (1986). An example is used to illustrate the nature of the relationships and the difference in the types of reasoning performed by these two representations. An abductive type of reasoning objective that requires use of the known qualitative relationships of the domain is demonstrated. A suitable way to represent such qualitative relationships along with the probabilistic knowledge is given, and how an explanation for a set of observed events may be constituted is discussed. An algorithm for learning the qualitative relationships from empirical data, based on the minimization of conditional entropy, is presented.
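The quantity that the learning algorithm above minimizes, conditional entropy, can be estimated from paired observations as in the following minimal sketch. It shows only the estimate itself, not the paper's learning procedure; the function name `conditional_entropy` and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def conditional_entropy(pairs):
    """Estimate H(Y | X) in bits from a list of (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)                      # counts of each (x, y)
    marginal_x = Counter(x for x, _ in pairs)   # counts of each x
    h = 0.0
    for (x, _y), count in joint.items():
        # p(x, y) * log2 p(y | x), accumulated with a minus sign.
        h -= (count / n) * math.log2(count / marginal_x[x])
    return h


# Toy data: Y is fully determined by X, so knowing X leaves no uncertainty.
determined = [(0, "a"), (0, "a"), (1, "b")]
# Toy data: Y is a fair coin regardless of X, so one bit of uncertainty remains.
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]

h_determined = conditional_entropy(determined)
h_independent = conditional_entropy(independent)
```

A low value of H(Y | X) suggests that X is informative about Y, which is the intuition behind using it as a criterion for choosing relationships among attributes.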

13.
14.
Factorial Hidden Markov Models   Total citations: 15, self-citations: 0, citations by others: 15
Hidden Markov models (HMMs) have proven to be one of the most widely used tools for learning probabilistic models of time series data. In an HMM, information about the past is conveyed through a single discrete variable: the hidden state. We discuss a generalization of HMMs in which this state is factored into multiple state variables and is therefore represented in a distributed manner. We describe an exact algorithm for inferring the posterior probabilities of the hidden state variables given the observations, and relate it to the forward-backward algorithm for HMMs and to algorithms for more general graphical models. Due to the combinatorial nature of the hidden state representation, this exact algorithm is intractable. As in other intractable systems, approximate inference can be carried out using Gibbs sampling or variational methods. Within the variational framework, we present a structured approximation in which the state variables are decoupled, yielding a tractable algorithm for learning the parameters of the model. Empirical comparisons suggest that these approximations are efficient and provide accurate alternatives to the exact methods. Finally, we use the structured approximation to model Bach's chorales and show that factorial HMMs can capture statistical structure in this data set which an unconstrained HMM cannot.

15.
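Cao Cungen, Sui Yuefei, Sun Yu, Zeng Qingtian, Journal of Software (软件学报), 2006, 17(8): 1731-1742
Mathematical knowledge representation is an important aspect of knowledge representation and underlies mathematical knowledge retrieval, automated theorem proving, intelligent tutoring systems, and similar applications. Motivated by problems encountered in designing the mathematical knowledge representation language of NKI (national knowledge infrastructure), and building on a discussion of the ontological assumptions about mathematical objects, two methods of representing mathematical knowledge are proposed: one is a description logic in which the value domain of attributes consists of formulas over a logical language; the other is a first-order logic in which an ontology described in description logic forms part of the logical language. In the former, if no restriction is placed on the formulas, reasoning in the resulting knowledge base cannot be carried out algorithmically. In the latter, reasoning within the ontology described in description logic can be carried out algorithmically, whereas reasoning over the mathematical knowledge expressed in the first-order logic that includes the ontology as part of its language generally cannot. Therefore, when representing mathematical knowledge, conceptual knowledge (knowledge in the ontology) must be distinguished from non-conceptual knowledge (knowledge expressed using the ontology as a language). Frames or description logics can represent conceptual knowledge and reason about it efficiently, but adding non-conceptual knowledge to the frames or knowledge base may leave a knowledge base that previously supported efficient reasoning with no efficient reasoning algorithm, or with no reasoning algorithm at all. It is therefore recommended that conceptual knowledge be represented with frames or description logic, and that the resulting knowledge base then be used as part of a logical language for representing non-conceptual knowledge.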

16.
In this paper a method is proposed to recognize symbols in electrical diagrams based on probabilistic matching. The skeletons of the symbols are represented by graphs. After finding the pose of the graph (orientation, translation, scale) by a bounded search for a minimum error transformation, the observed graph is matched to the class models and the likelihood of the match is calculated. Results are given for computer-generated symbols and hand-drawn symbols with and without a template. Error rates range from <1% to 8%.

17.
A Continuous Probabilistic Framework for Image Matching   Total citations: 1, self-citations: 0, citations by others: 1
In this paper we describe a probabilistic image matching scheme in which the image representation is continuous and the similarity measure and distance computation are also defined in the continuous domain. Each image is first represented as a Gaussian mixture distribution and images are compared and matched via a probabilistic measure of similarity between distributions. A common probabilistic and continuous framework is applied to the representation as well as the matching process, ensuring an overall system that is theoretically appealing. Matching results are investigated and the application to an image retrieval system is demonstrated.

18.
The phasic firing of dopamine neurons has been theorized to encode a reward-prediction error as formalized by the temporal-difference (TD) algorithm in reinforcement learning. Most TD models of dopamine have assumed a stimulus representation, known as the complete serial compound, in which each moment in a trial is distinctly represented. We introduce a more realistic temporal stimulus representation for the TD model. In our model, all external stimuli, including rewards, spawn a series of internal microstimuli, which grow weaker and more diffuse over time. These microstimuli are used by the TD learning algorithm to generate predictions of future reward. This new stimulus representation injects temporal generalization into the TD model and enhances correspondence between model and data in several experiments, including those when rewards are omitted or received early. This improved fit mostly derives from the absence of large negative errors in the new model, suggesting that dopamine alone can encode the full range of TD errors in these situations.
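The following sketch illustrates the two ingredients named in the abstract above: a TD error computed from a linear value estimate, and stimulus features that weaken over time. The specific feature form (Gaussian bumps over an exponentially decaying memory trace, scaled by the trace height), the parameter values, and the names `microstimuli` and `td_error` are assumptions chosen for illustration and should not be read as the paper's exact model.

```python
import math


def microstimuli(t, n=5, decay=0.9, sigma=0.1):
    """Illustrative feature vector t steps after stimulus onset (assumed form)."""
    trace = decay ** t  # memory trace decays from 1 toward 0
    centers = [(i + 1) / n for i in range(n)]
    # Each feature is a Gaussian bump over the trace height, scaled by the
    # trace itself, so features that peak later are also weaker.
    return [
        trace * math.exp(-((trace - c) ** 2) / (2 * sigma ** 2))
        for c in centers
    ]


def td_error(reward, weights, x_now, x_next, gamma=0.98):
    """TD error: reward + gamma * V(next) - V(now), with V linear in features."""
    v_now = sum(w * x for w, x in zip(weights, x_now))
    v_next = sum(w * x for w, x in zip(weights, x_next))
    return reward + gamma * v_next - v_now


# Before any learning (all-zero weights) the TD error equals the reward.
weights = [0.0] * 5
delta = td_error(1.0, weights, microstimuli(3), microstimuli(4))
```

Because the bumps have equal width in trace space while the trace decays exponentially, features centered on small trace values stay active over a longer span of time steps, which gives the "weaker and more diffuse" behaviour in this toy version.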

19.
This paper shows how the nondirectional structural analysis of pattern data can be performed by matching a problem reduction representation (PRR) of pattern structure with sample data, using a best-first state space search algorithm called SSS*. The end result of the matching algorithm is a tree whose nodes represent recognized structures in the data. Tip nodes of the tree structure correspond to primitives which are recognized in the raw data by curve fitting routines. The operators of the algorithm allow the tree to be constructed with a combination of top-down or bottom-up steps. The matching of the structure tree to waveform segments need not be done in a left-right sequence. Moreover, ambiguous matches are pursued in a best-first order by using state space search with partial parse trees as states. A software system called WAPSYS (for waveform parsing system) is described, which implements this structural analysis paradigm. Experience using WAPSYS to analyze carotid pulse waves is also discussed.

20.
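This paper presents a method for recognizing electrical schematic diagrams drawn with general-purpose CAD software. The method first extracts electrical symbols from the drawing according to their geometric features and represents them as attributed graphs with two levels of attributes. It then recognizes the electrical symbols by combining filtering of the graph pattern library with the Ullman algorithm. Finally, it extracts the various text annotations in the drawing and the connection relationships among the electrical symbols. Experiments show that the method recognizes electrical schematic diagrams accurately.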
