共查询到20条相似文献,搜索用时 453 毫秒
1.
A Novel Approach for Phase-Type Fitting with the EM Algorithm 总被引:2,自引:0,他引:2
Thummler A. Buchholz P. Telek M. 《Dependable and Secure Computing, IEEE Transactions on》2006,3(3):245-258
The representation of general distributions or measured data by phase-type distributions is an important and nontrivial task in analytical modeling. Although a large number of different methods for fitting parameters of phase-type distributions to data traces exist, many approaches lack efficiency and numerical stability. In this paper, a novel approach is presented that fits a restricted class of phase-type distributions, namely, mixtures of Erlang distributions, to trace data. For the parameter fitting, an algorithm of the expectation maximization type is developed. This paper shows that these choices result in a very efficient and numerically stable approach which yields phase-type approximations for a wide range of data traces that are as good or better than approximations computed with other less efficient and less stable fitting methods. To illustrate the effectiveness of the proposed fitting algorithm, we present comparative results for our approach and two other methods using six benchmark traces and two real traffic traces as well as quantitative results from queueing analysis. 相似文献
2.
Hiroyuki OkamuraAuthor Vitae Tadashi DohiAuthor VitaeKishor S. TrivediAuthor Vitae 《Performance Evaluation》2011,68(10):938-954
This paper proposes an improved computation method of maximum likelihood (ML) estimation for phase-type (PH) distributions with a number of phases. We focus on the EM (expectation-maximization) algorithm proposed by Asmussen et al. [27] and refine it in terms of time complexity. Two ideas behind our method are a uniformization-based procedure for computing a convolution integral of the matrix exponential and an improvement of the forward-backward algorithm using time intervals. Compared with the differential-equation-based EM algorithm discussed in Asmussen et al. [27], our approach succeeds in the reduction of computation time for the PH fitting with a moderate to large number of phases. In addition to the improvement of time complexity, this paper discusses how to estimate the canonical form by applying the EM algorithm. In numerical experiments, we examine computation times of the proposed and differential-equation-based EM algorithms. Furthermore, the proposed EM algorithm is also compared with the existing PH fitting methods in terms of computation time and fitting accuracy. 相似文献
3.
基于有限混合多变量t分布的鲁棒聚类算法 总被引:1,自引:1,他引:0
在用混合模型聚类时,聚类数据中存在局外点是非常困难的问题。为了提高混合拟合的鲁棒性,本文用混合t模型替代混合高斯模型,来拟合含有背景噪音的多变量多高斯分布数据;提出了两个求解混合t模型的修改版期望最大化(EM)算法,并将它们与模型选择准则集成在一起,应用一个组合规则成分灭绝策略选择聚类成分数,得到两个对应的鲁棒聚类算法。对含有背景噪音的多个高斯成分进行不同聚类算法的大量实验表明,本文的鲁棒聚类算法能自动选择最佳的聚类成分数,相对于混合高斯模型的聚类方法,鲁棒性增强很多;相对于传统求解混合t模型(EM/ECM)的聚类方法,能有效避免其严重依赖初始值和易收敛至参数空间边界的缺点,具有较强的鲁棒性和较快的收敛速度。 相似文献
4.
Bruce A Draper Daniel L Elliott Jeremy Hayes Kyungim Baek 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2005,35(3):571-577
This paper considers fitting a mixture of Gaussians model to high-dimensional data in scenarios where there are fewer data samples than feature dimensions. Issues that arise when using principal component analysis (PCA) to represent Gaussian distributions inside Expectation-Maximization (EM) are addressed, and a practical algorithm results. Unlike other algorithms that have been proposed, this algorithm does not try to compress the data to fit low-dimensional models. Instead, it models Gaussian distributions in the (N - 1)-dimensional space spanned by the N data samples. We are able to show that this algorithm converges on data sets where low-dimensional techniques do not. 相似文献
5.
6.
遥感信息是地球表层信息的综合反映,由于地球表层系统的复杂性和开放性,地表信息是多维的、无限的、遥感信息传递过程中的局限性以及遥感信息之间的复杂相关性,决定了遥感信息其结果的不确定性和多解性,遥感信息具有一定的统计特性,同时又具有高度的随机性和复杂性,在特征空间中往往表现为混合密度分布,针对遥感信息这种统计分布的复杂性,提出了有限混合密度的期望最大(EM)分解模型,该模型假设总体分布可分解为有限个参数化的密度分布,通过EM迭代计算可估计出各密度分布的最大似然参数集;将有限混合EM聚类算法应用于遥感影像的聚类分析中,并与传统统计聚类方法进行了比较,比较结果表明,其对复杂地物的区分具有优势,另外在融合专家知识、初始化等方面具有扩展能力。 相似文献
7.
8.
This paper demonstrates how the EM algorithm can be used for learning and matching mixtures of point distribution models. We make two contributions. First, we show how shape-classes can be learned in an unsupervised manner. We present a fast procedure for training point distribution models using the EM algorithm. Rather than estimating the class means and covariance matrices needed to construct the PDM, the method iteratively refines the eigenvectors of the covariance matrix using a gradient ascent technique. Second, we show how recognition by alignment can be realised by fitting a mixture of linear shape deformations. We evaluate the method on the problem of learning the class-structure and recognising Arabic characters. 相似文献
9.
Jing Wang 《Computational statistics & data analysis》2007,51(6):3244-3256
Implementing the Monte Carlo EM algorithm (MCEM) algorithm for finding maximum likelihood estimates (MLEs) in the nonlinear mixed effects model (NLMM) has encountered a great deal of difficulty in obtaining samples used for estimating the E step due to the intractability of the target distribution. Sampling methods such as Markov chain techniques and importance sampling have been used to alleviate such difficulty. The advantage of Markov chains is that they are applicable to a wider range of distributions than the approaches based on independent samples. However, in many cases the computational cost of Markov chains is significantly greater than that of independent samplers. The MCEM algorithms based on independent samples allow for straightforward assessment of Monte Carlo error and can be considerably more efficient than those based on Markov chains when an efficient candidate distribution is chosen, which forms the motivation of this paper. The proposed MCEM algorithm in this paper uses samples obtained from an easy-to-simulate and efficient importance distribution so that the computational intensity and complexity is much reduced. Moreover, the proposed MCEM algorithm preserves the flexibility introduced by independent samples in gauging Monte Carlo error and thus allows the Monte Carlo sample size to increase with the number of EM iterations. We also introduce an EM algorithm using Gaussian quadrature approximations (GQEM) for the E step. In low-dimensional cases, the GQEM algorithm is more efficient than the proposed MCEM algorithm and thus can be used as an alternative. The performances of the proposed EM methods are compared to the existing ML estimators using real data examples and simulations. 相似文献
10.
Online learning with hidden markov models 总被引:1,自引:0,他引:1
We present an online version of the expectation-maximization (EM) algorithm for hidden Markov models (HMMs). The sufficient statistics required for parameters estimation is computed recursively with time, that is, in an online way instead of using the batch forward-backward procedure. This computational scheme is generalized to the case where the model parameters can change with time by introducing a discount factor into the recurrence relations. The resulting algorithm is equivalent to the batch EM algorithm, for appropriate discount factor and scheduling of parameters update. On the other hand, the online algorithm is able to deal with dynamic environments, i.e., when the statistics of the observed data is changing with time. The implications of the online algorithm for probabilistic modeling in neuroscience are briefly discussed. 相似文献
11.
Ammar M. Sarhan David C. HamiltonBruce Smith Debasis Kundu 《Computational statistics & data analysis》2011,55(1):644-654
The two-parameter linear failure rate distribution has been used quite successfully to analyze lifetime data. Recently, a new three-parameter distribution, known as the generalized linear failure rate distribution, has been introduced by exponentiating the linear failure rate distribution. The generalized linear failure rate distribution is a very flexible lifetime distribution, and the probability density function of the generalized linear failure rate distribution can take different shapes. Its hazard function also can be increasing, decreasing and bathtub shaped. The main aim of this paper is to introduce a bivariate generalized linear failure rate distribution, whose marginals are generalized linear failure rate distributions. It is obtained using the same approach as was adopted to obtain the Marshall-Olkin bivariate exponential distribution. Different properties of this new distribution are established. The bivariate generalized linear failure rate distribution has five parameters and the maximum likelihood estimators are obtained using the EM algorithm. A data set is analyzed for illustrative purposes. Finally, some generalizations to the multivariate case are proposed. 相似文献
12.
An EM algorithm for the block mixture model 总被引:1,自引:0,他引:1
Govaert G Nadif M 《IEEE transactions on pattern analysis and machine intelligence》2005,27(4):643-647
Although many clustering procedures aim to construct an optimal partition of objects or, sometimes, of variables, there are other methods, called block clustering methods, which consider simultaneously the two sets and organize the data into homogeneous blocks. Recently, we have proposed a new mixture model called block mixture model which takes into account this situation. This model allows one to embed simultaneous clustering of objects and variables in a mixture approach. We have studied this probabilistic model under the classification likelihood approach and developed a new algorithm for simultaneous partitioning based on the classification EM algorithm. In this paper, we consider the block clustering problem under the maximum likelihood approach and the goal of our contribution is to estimate the parameters of this model. Unfortunately, the application of the EM algorithm for the block mixture model cannot be made directly; difficulties arise due to the dependence structure in the model and approximations are required. Using a variational approximation, we propose a generalized EM algorithm to estimate the parameters of the block mixture model and, to illustrate our approach, we study the case of binary data by using a Bernoulli block mixture. 相似文献
13.
In this paper, we present a novel spatially constrained generative model and an expectation-maximization (EM) algorithm for model-based image segmentation. The generative model assumes that the unobserved class labels of neighboring pixels in the image are generated by prior distributions with similar parameters, where similarity is defined by entropic quantities relating to the neighboring priors. In order to estimate model parameters from observations, we derive a spatially constrained EM algorithm that iteratively maximizes a lower bound on the data log-likelihood, where the penalty term is data-dependent. Our algorithm is very easy to implement and is similar to the standard EM algorithm for Gaussian mixtures with the main difference that the labels posteriors are "smoothed" over pixels between each E- and M-step by a standard image filter. Experiments on synthetic and real images show that our algorithm achieves competitive segmentation results compared to other Markov-based methods, and is in general faster 相似文献
14.
Stochastic logic programs (SLPs) are logic programs with parameterised clauses which define a log-linear distribution over refutations of goals. The log-linear distribution provides, by marginalisation, a distribution over variable bindings, allowing SLPs to compactly represent quite complex distributions.We analyse the fundamental statistical properties of SLPs addressing issues concerning infinite derivations, 'unnormalised SLPs and impure SLPs. After detailing existing approaches to parameter estimation for log-linear models and their application to SLPs, we present a new algorithm called failure-adjusted maximisation (FAM). FAM is an instance of the EM algorithm that applies specifically to normalised SLPs and provides a closed-form for computing parameter updates within an iterative maximisation approach. We empirically show that FAM works on some small examples and discuss methods for applying it to bigger problems. 相似文献
15.
Estimating the parameters of the Marshall-Olkin bivariate Weibull distribution by EM algorithm 总被引:1,自引:0,他引:1
In this paper we consider the Marshall-Olkin bivariate Weibull distribution. The Marshall-Olkin bivariate Weibull distribution is a singular distribution, whose both the marginals are univariate Weibull distributions. This is a generalization of the Marshall-Olkin bivariate exponential distribution. The cumulative joint distribution of the Marshall-Olkin bivariate Weibull distribution is a mixture of an absolute continuous distribution function and a singular distribution function. This distribution has four unknown parameters and it is observed that the maximum likelihood estimators of the unknown parameters cannot be obtained in explicit forms. In this paper we discuss about the computation of the maximum likelihood estimators of the unknown parameters using EM algorithm. We perform some simulations to see the performances of the EM algorithm and re-analyze one data set for illustrative purpose. 相似文献
16.
We present an algorithm to estimate the parameters of a linear model in the presence of heteroscedastic noise, i.e., each data point having a different covariance matrix. The algorithm is motivated by the recovery of bilinear forms, one of the fundamental problems in computer vision which appears whenever the epipolar constraint is imposed, or a conic is fit to noisy data points. We employ the errors-in-variables (EIV) model and show why already at moderate noise levels most available methods fail to provide a satisfactory solution. The improved behavior of the new algorithm is due to two factors: taking into account the heteroscedastic nature of the errors arising from the linearization of the bilinear form, and the use of generalized singular value decomposition (GSVD) in the computations. The performance of the algorithm is compared with several methods proposed in the literature for ellipse fitting and estimation of the fundamental matrix. It is shown that the algorithm achieves the accuracy of nonlinear optimization techniques at much less computational cost. 相似文献
17.
Tzu-Tsung Wong 《Computational statistics & data analysis》2010,54(7):1756-1765
Generalized Dirichlet distributions have a more flexible covariance structure than Dirichlet distributions, and the computation for the moments of a generalized Dirichlet distribution is still tractable. For situations under which Dirichlet distributions are inappropriate for data analysis, generalized Dirichlet distributions will generally be an applicable alternative. When the expected values and the covariance matrix of random variables can be estimated from available data, this study introduces ways to estimate the parameters of a generalized Dirichlet distribution for analyzing compositional data. Under the assumption that the sample mean of every variable must be considered for parameter estimation, we present methods for choosing the statistics from a sample covariance matrix to construct a generalized Dirichlet distribution. Some rules for removing inappropriate statistics from a sample covariance matrix to speed up the estimation process are also established. An example for Taiwan’s car market is introduced to demonstrate the applicability of the parameter estimation methods. 相似文献
18.
Víctor Leiva Marco Riquelme Antonio Sanhueza 《Computational statistics & data analysis》2008,52(4):2079-2097
In this paper, we consider a family of generalized Birnbaum-Saunders distributions and present a lifetime analysis based mainly on the hazard function of this model. In addition, we carry out maximum likelihood estimation by using an iterative algorithm, which produces robust estimates. Asymptotic inference is also presented. Next, the quality of the estimation method is examined by means of Monte Carlo simulations. We then provide a practical example to illustrate the obtained results. From this example and based on goodness-of-fit methods, we show that the GBS distribution results in a more appropriate model for modeling fatigue data than other models commonly used to model this type of data. Finally, we estimate the hazard function and the critical point of this function. 相似文献
19.
ABSTRACTLearning parameters of a probabilistic model is a necessary step in machine learning tasks. We present a method to improve learning from small datasets by using monotonicity conditions. Monotonicity simplifies the learning and it is often required by users. We present an algorithm for Bayesian Networks parameter learning. The algorithm and monotonicity conditions are described, and it is shown that with the monotonicity conditions we can better fit underlying data. Our algorithm is tested on artificial and empiric datasets. We use different methods satisfying monotonicity conditions: the proposed gradient descent, isotonic regression EM, and non-linear optimization. We also provide results of unrestricted EM and gradient descent methods. Learned models are compared with respect to their ability to fit data in terms of log-likelihood and their fit of parameters of the generating model. Our proposed method outperforms other methods for small sets, and provides better or comparable results for larger sets. 相似文献
20.
对于关键复杂设备进行健康诊断和设备剩余寿命预测,提出了一种基于爱尔朗分布和隐半马尔可夫模型的联合剩余寿命预测模型(Erlang-HSMM,E-HSMM)。首先,提出了改进的前后向算法、维特比算法和BaumWelch算法,有效地降低了模型的计算复杂度;其次,基于爱尔朗分布改进设备的健康状态逗留时间,将状态逗留时间分为已遍历和未遍历两个部分,提出新的健康状态逗留时间的概率分布;最后,针对状态监测数据,利用失效率理论构建设备剩余寿命预测模型。通过美国Caterpillar公司液压泵的状态监测实际数据进行评价与验证,实验结果表明,E-HSMM模型对设备的状态诊断和剩余寿命预测更加符合实际状况,比传统的隐半马尔可夫模型(HSMM)更有效。 相似文献