首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 453 毫秒
1.
A Novel Approach for Phase-Type Fitting with the EM Algorithm   总被引:2,自引:0,他引:2  
The representation of general distributions or measured data by phase-type distributions is an important and nontrivial task in analytical modeling. Although a large number of different methods for fitting parameters of phase-type distributions to data traces exist, many approaches lack efficiency and numerical stability. In this paper, a novel approach is presented that fits a restricted class of phase-type distributions, namely, mixtures of Erlang distributions, to trace data. For the parameter fitting, an algorithm of the expectation maximization type is developed. This paper shows that these choices result in a very efficient and numerically stable approach which yields phase-type approximations for a wide range of data traces that are as good or better than approximations computed with other less efficient and less stable fitting methods. To illustrate the effectiveness of the proposed fitting algorithm, we present comparative results for our approach and two other methods using six benchmark traces and two real traffic traces as well as quantitative results from queueing analysis.  相似文献   

2.
This paper proposes an improved computation method of maximum likelihood (ML) estimation for phase-type (PH) distributions with a number of phases. We focus on the EM (expectation-maximization) algorithm proposed by Asmussen et al. [27] and refine it in terms of time complexity. Two ideas behind our method are a uniformization-based procedure for computing a convolution integral of the matrix exponential and an improvement of the forward-backward algorithm using time intervals. Compared with the differential-equation-based EM algorithm discussed in Asmussen et al. [27], our approach succeeds in the reduction of computation time for the PH fitting with a moderate to large number of phases. In addition to the improvement of time complexity, this paper discusses how to estimate the canonical form by applying the EM algorithm. In numerical experiments, we examine computation times of the proposed and differential-equation-based EM algorithms. Furthermore, the proposed EM algorithm is also compared with the existing PH fitting methods in terms of computation time and fitting accuracy.  相似文献   

3.
基于有限混合多变量t分布的鲁棒聚类算法   总被引:1,自引:1,他引:0  
余成文  郭雷 《计算机科学》2007,34(5):190-193
在用混合模型聚类时,聚类数据中存在局外点是非常困难的问题。为了提高混合拟合的鲁棒性,本文用混合t模型替代混合高斯模型,来拟合含有背景噪音的多变量多高斯分布数据;提出了两个求解混合t模型的修改版期望最大化(EM)算法,并将它们与模型选择准则集成在一起,应用一个组合规则成分灭绝策略选择聚类成分数,得到两个对应的鲁棒聚类算法。对含有背景噪音的多个高斯成分进行不同聚类算法的大量实验表明,本文的鲁棒聚类算法能自动选择最佳的聚类成分数,相对于混合高斯模型的聚类方法,鲁棒性增强很多;相对于传统求解混合t模型(EM/ECM)的聚类方法,能有效避免其严重依赖初始值和易收敛至参数空间边界的缺点,具有较强的鲁棒性和较快的收敛速度。  相似文献   

4.
This paper considers fitting a mixture of Gaussians model to high-dimensional data in scenarios where there are fewer data samples than feature dimensions. Issues that arise when using principal component analysis (PCA) to represent Gaussian distributions inside Expectation-Maximization (EM) are addressed, and a practical algorithm results. Unlike other algorithms that have been proposed, this algorithm does not try to compress the data to fit low-dimensional models. Instead, it models Gaussian distributions in the (N - 1)-dimensional space spanned by the N data samples. We are able to show that this algorithm converges on data sets where low-dimensional techniques do not.  相似文献   

5.
PH分布数据拟合的数值加速EM算法   总被引:1,自引:0,他引:1       下载免费PDF全文
黄卓  潘晓  郭波 《计算机工程》2008,34(14):1-3
摘 要:针对Phase-type(PH)分布数据拟合EM算法收敛速度慢的问题,提出一种数值加速EM算法,通过增加每一步EM迭代的参数变化量达到加速的目的。用4个拟合实例与标准EM算法拟合进行对比,结果表明,该加速EM算法简单实用,保证了算法的收敛性,有效提高了PH分布数据拟合EM算法的收敛速度。  相似文献   

6.
有限混合密度模型及遥感影像EM聚类算法   总被引:3,自引:0,他引:3       下载免费PDF全文
遥感信息是地球表层信息的综合反映,由于地球表层系统的复杂性和开放性,地表信息是多维的、无限的、遥感信息传递过程中的局限性以及遥感信息之间的复杂相关性,决定了遥感信息其结果的不确定性和多解性,遥感信息具有一定的统计特性,同时又具有高度的随机性和复杂性,在特征空间中往往表现为混合密度分布,针对遥感信息这种统计分布的复杂性,提出了有限混合密度的期望最大(EM)分解模型,该模型假设总体分布可分解为有限个参数化的密度分布,通过EM迭代计算可估计出各密度分布的最大似然参数集;将有限混合EM聚类算法应用于遥感影像的聚类分析中,并与传统统计聚类方法进行了比较,比较结果表明,其对复杂地物的区分具有优势,另外在融合专家知识、初始化等方面具有扩展能力。  相似文献   

7.
黄卓  潘晓  郭波 《计算机工程》2008,34(4):75-78
ACPH分布继承了PH分布具有良好特性的特点,对其进行数据拟合的难度比PH大大降低。针对ACPH分布缺乏数值稳定拟合算法的问题,提出采用EM算法解决该问题,给出了ACPH分布数据拟合EM算法的理论推导,并通过3个拟合实例验证了算法的有效性。  相似文献   

8.
This paper demonstrates how the EM algorithm can be used for learning and matching mixtures of point distribution models. We make two contributions. First, we show how shape-classes can be learned in an unsupervised manner. We present a fast procedure for training point distribution models using the EM algorithm. Rather than estimating the class means and covariance matrices needed to construct the PDM, the method iteratively refines the eigenvectors of the covariance matrix using a gradient ascent technique. Second, we show how recognition by alignment can be realised by fitting a mixture of linear shape deformations. We evaluate the method on the problem of learning the class-structure and recognising Arabic characters.  相似文献   

9.
Implementing the Monte Carlo EM algorithm (MCEM) algorithm for finding maximum likelihood estimates (MLEs) in the nonlinear mixed effects model (NLMM) has encountered a great deal of difficulty in obtaining samples used for estimating the E step due to the intractability of the target distribution. Sampling methods such as Markov chain techniques and importance sampling have been used to alleviate such difficulty. The advantage of Markov chains is that they are applicable to a wider range of distributions than the approaches based on independent samples. However, in many cases the computational cost of Markov chains is significantly greater than that of independent samplers. The MCEM algorithms based on independent samples allow for straightforward assessment of Monte Carlo error and can be considerably more efficient than those based on Markov chains when an efficient candidate distribution is chosen, which forms the motivation of this paper. The proposed MCEM algorithm in this paper uses samples obtained from an easy-to-simulate and efficient importance distribution so that the computational intensity and complexity is much reduced. Moreover, the proposed MCEM algorithm preserves the flexibility introduced by independent samples in gauging Monte Carlo error and thus allows the Monte Carlo sample size to increase with the number of EM iterations. We also introduce an EM algorithm using Gaussian quadrature approximations (GQEM) for the E step. In low-dimensional cases, the GQEM algorithm is more efficient than the proposed MCEM algorithm and thus can be used as an alternative. The performances of the proposed EM methods are compared to the existing ML estimators using real data examples and simulations.  相似文献   

10.
Online learning with hidden markov models   总被引:1,自引:0,他引:1  
We present an online version of the expectation-maximization (EM) algorithm for hidden Markov models (HMMs). The sufficient statistics required for parameters estimation is computed recursively with time, that is, in an online way instead of using the batch forward-backward procedure. This computational scheme is generalized to the case where the model parameters can change with time by introducing a discount factor into the recurrence relations. The resulting algorithm is equivalent to the batch EM algorithm, for appropriate discount factor and scheduling of parameters update. On the other hand, the online algorithm is able to deal with dynamic environments, i.e., when the statistics of the observed data is changing with time. The implications of the online algorithm for probabilistic modeling in neuroscience are briefly discussed.  相似文献   

11.
The two-parameter linear failure rate distribution has been used quite successfully to analyze lifetime data. Recently, a new three-parameter distribution, known as the generalized linear failure rate distribution, has been introduced by exponentiating the linear failure rate distribution. The generalized linear failure rate distribution is a very flexible lifetime distribution, and the probability density function of the generalized linear failure rate distribution can take different shapes. Its hazard function also can be increasing, decreasing and bathtub shaped. The main aim of this paper is to introduce a bivariate generalized linear failure rate distribution, whose marginals are generalized linear failure rate distributions. It is obtained using the same approach as was adopted to obtain the Marshall-Olkin bivariate exponential distribution. Different properties of this new distribution are established. The bivariate generalized linear failure rate distribution has five parameters and the maximum likelihood estimators are obtained using the EM algorithm. A data set is analyzed for illustrative purposes. Finally, some generalizations to the multivariate case are proposed.  相似文献   

12.
An EM algorithm for the block mixture model   总被引:1,自引:0,他引:1  
Although many clustering procedures aim to construct an optimal partition of objects or, sometimes, of variables, there are other methods, called block clustering methods, which consider simultaneously the two sets and organize the data into homogeneous blocks. Recently, we have proposed a new mixture model called block mixture model which takes into account this situation. This model allows one to embed simultaneous clustering of objects and variables in a mixture approach. We have studied this probabilistic model under the classification likelihood approach and developed a new algorithm for simultaneous partitioning based on the classification EM algorithm. In this paper, we consider the block clustering problem under the maximum likelihood approach and the goal of our contribution is to estimate the parameters of this model. Unfortunately, the application of the EM algorithm for the block mixture model cannot be made directly; difficulties arise due to the dependence structure in the model and approximations are required. Using a variational approximation, we propose a generalized EM algorithm to estimate the parameters of the block mixture model and, to illustrate our approach, we study the case of binary data by using a Bernoulli block mixture.  相似文献   

13.
In this paper, we present a novel spatially constrained generative model and an expectation-maximization (EM) algorithm for model-based image segmentation. The generative model assumes that the unobserved class labels of neighboring pixels in the image are generated by prior distributions with similar parameters, where similarity is defined by entropic quantities relating to the neighboring priors. In order to estimate model parameters from observations, we derive a spatially constrained EM algorithm that iteratively maximizes a lower bound on the data log-likelihood, where the penalty term is data-dependent. Our algorithm is very easy to implement and is similar to the standard EM algorithm for Gaussian mixtures with the main difference that the labels posteriors are "smoothed" over pixels between each E- and M-step by a standard image filter. Experiments on synthetic and real images show that our algorithm achieves competitive segmentation results compared to other Markov-based methods, and is in general faster  相似文献   

14.
Cussens  James 《Machine Learning》2001,44(3):245-271
Stochastic logic programs (SLPs) are logic programs with parameterised clauses which define a log-linear distribution over refutations of goals. The log-linear distribution provides, by marginalisation, a distribution over variable bindings, allowing SLPs to compactly represent quite complex distributions.We analyse the fundamental statistical properties of SLPs addressing issues concerning infinite derivations, 'unnormalised SLPs and impure SLPs. After detailing existing approaches to parameter estimation for log-linear models and their application to SLPs, we present a new algorithm called failure-adjusted maximisation (FAM). FAM is an instance of the EM algorithm that applies specifically to normalised SLPs and provides a closed-form for computing parameter updates within an iterative maximisation approach. We empirically show that FAM works on some small examples and discuss methods for applying it to bigger problems.  相似文献   

15.
In this paper we consider the Marshall-Olkin bivariate Weibull distribution. The Marshall-Olkin bivariate Weibull distribution is a singular distribution, whose both the marginals are univariate Weibull distributions. This is a generalization of the Marshall-Olkin bivariate exponential distribution. The cumulative joint distribution of the Marshall-Olkin bivariate Weibull distribution is a mixture of an absolute continuous distribution function and a singular distribution function. This distribution has four unknown parameters and it is observed that the maximum likelihood estimators of the unknown parameters cannot be obtained in explicit forms. In this paper we discuss about the computation of the maximum likelihood estimators of the unknown parameters using EM algorithm. We perform some simulations to see the performances of the EM algorithm and re-analyze one data set for illustrative purpose.  相似文献   

16.
We present an algorithm to estimate the parameters of a linear model in the presence of heteroscedastic noise, i.e., each data point having a different covariance matrix. The algorithm is motivated by the recovery of bilinear forms, one of the fundamental problems in computer vision which appears whenever the epipolar constraint is imposed, or a conic is fit to noisy data points. We employ the errors-in-variables (EIV) model and show why already at moderate noise levels most available methods fail to provide a satisfactory solution. The improved behavior of the new algorithm is due to two factors: taking into account the heteroscedastic nature of the errors arising from the linearization of the bilinear form, and the use of generalized singular value decomposition (GSVD) in the computations. The performance of the algorithm is compared with several methods proposed in the literature for ellipse fitting and estimation of the fundamental matrix. It is shown that the algorithm achieves the accuracy of nonlinear optimization techniques at much less computational cost.  相似文献   

17.
Generalized Dirichlet distributions have a more flexible covariance structure than Dirichlet distributions, and the computation for the moments of a generalized Dirichlet distribution is still tractable. For situations under which Dirichlet distributions are inappropriate for data analysis, generalized Dirichlet distributions will generally be an applicable alternative. When the expected values and the covariance matrix of random variables can be estimated from available data, this study introduces ways to estimate the parameters of a generalized Dirichlet distribution for analyzing compositional data. Under the assumption that the sample mean of every variable must be considered for parameter estimation, we present methods for choosing the statistics from a sample covariance matrix to construct a generalized Dirichlet distribution. Some rules for removing inappropriate statistics from a sample covariance matrix to speed up the estimation process are also established. An example for Taiwan’s car market is introduced to demonstrate the applicability of the parameter estimation methods.  相似文献   

18.
In this paper, we consider a family of generalized Birnbaum-Saunders distributions and present a lifetime analysis based mainly on the hazard function of this model. In addition, we carry out maximum likelihood estimation by using an iterative algorithm, which produces robust estimates. Asymptotic inference is also presented. Next, the quality of the estimation method is examined by means of Monte Carlo simulations. We then provide a practical example to illustrate the obtained results. From this example and based on goodness-of-fit methods, we show that the GBS distribution results in a more appropriate model for modeling fatigue data than other models commonly used to model this type of data. Finally, we estimate the hazard function and the critical point of this function.  相似文献   

19.
ABSTRACT

Learning parameters of a probabilistic model is a necessary step in machine learning tasks. We present a method to improve learning from small datasets by using monotonicity conditions. Monotonicity simplifies the learning and it is often required by users. We present an algorithm for Bayesian Networks parameter learning. The algorithm and monotonicity conditions are described, and it is shown that with the monotonicity conditions we can better fit underlying data. Our algorithm is tested on artificial and empiric datasets. We use different methods satisfying monotonicity conditions: the proposed gradient descent, isotonic regression EM, and non-linear optimization. We also provide results of unrestricted EM and gradient descent methods. Learned models are compared with respect to their ability to fit data in terms of log-likelihood and their fit of parameters of the generating model. Our proposed method outperforms other methods for small sets, and provides better or comparable results for larger sets.  相似文献   

20.
对于关键复杂设备进行健康诊断和设备剩余寿命预测,提出了一种基于爱尔朗分布和隐半马尔可夫模型的联合剩余寿命预测模型(Erlang-HSMM,E-HSMM)。首先,提出了改进的前后向算法、维特比算法和BaumWelch算法,有效地降低了模型的计算复杂度;其次,基于爱尔朗分布改进设备的健康状态逗留时间,将状态逗留时间分为已遍历和未遍历两个部分,提出新的健康状态逗留时间的概率分布;最后,针对状态监测数据,利用失效率理论构建设备剩余寿命预测模型。通过美国Caterpillar公司液压泵的状态监测实际数据进行评价与验证,实验结果表明,E-HSMM模型对设备的状态诊断和剩余寿命预测更加符合实际状况,比传统的隐半马尔可夫模型(HSMM)更有效。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号