Similar Articles
Found 20 similar articles (search time: 5 ms)
1.
In this paper we build on an approach proposed by Zou et al. (2014) for nonparametric changepoint detection. This approach defines the best segmentation of a data set as the one which minimises a penalised cost function, with the cost function defined in terms of the negative of a nonparametric log-likelihood for the data within each segment. Minimising this cost function is possible using dynamic programming, but their algorithm has a computational cost that is cubic in the length of the data set. To speed up computation, Zou et al. (2014) resorted to a screening procedure, which means that the estimated segmentation is no longer guaranteed to be the global minimum of the cost function. We show that the screening procedure adversely affects the accuracy of the changepoint detection method, and show how a faster dynamic programming algorithm, pruned exact linear time (PELT) (Killick et al. 2012), can be used to find the optimal segmentation with a computational cost that can be close to linear in the amount of data. PELT requires a penalty to avoid under/over-fitting the model, which can have a detrimental effect on the quality of the detected changepoints. To overcome this issue we use a relatively new method, changepoints over a range of penalties (Haynes et al. 2016), which finds all of the optimal segmentations for multiple penalty values over a continuous range. We apply our method to detect changes in heart rate during physical activity.
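For illustration, here is a minimal sketch of PELT with a squared-error (Gaussian mean-change) segment cost; the data, the penalty value, and this particular cost are assumptions for the example, not choices made in the paper.

```python
# Minimal PELT sketch (Killick et al. 2012) with a squared-error cost.
import numpy as np

def pelt(y, beta):
    """Return optimal changepoint locations under penalty beta."""
    n = len(y)
    S1 = np.concatenate(([0.0], np.cumsum(y)))       # prefix sums
    S2 = np.concatenate(([0.0], np.cumsum(y ** 2)))  # prefix sums of squares

    def cost(s, t):
        # Squared-error cost of segment y[s:t] around its own mean.
        d = t - s
        return S2[t] - S2[s] - (S1[t] - S1[s]) ** 2 / d

    F = np.full(n + 1, np.inf)
    F[0] = -beta
    last = np.zeros(n + 1, dtype=int)
    candidates = [0]
    for t in range(1, n + 1):
        vals = [F[s] + cost(s, t) + beta for s in candidates]
        best = int(np.argmin(vals))
        F[t] = vals[best]
        last[t] = candidates[best]
        # Pruning step: keep only s that could still be optimal later.
        candidates = [s for s, v in zip(candidates, vals) if v - beta <= F[t]]
        candidates.append(t)

    # Backtrack to recover the changepoints.
    cps, t = [], n
    while t > 0:
        t = last[t]
        if t > 0:
            cps.append(t)
    return sorted(cps)

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
print(pelt(y, beta=3 * np.log(len(y))))  # roughly [100]
```

The pruning step is what brings the cost down from quadratic to close to linear when the data contain many changepoints.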

2.
Statistics and Computing - High-dimensional changepoint analysis is a growing area of research and has applications in a wide range of fields. The aim is to accurately and efficiently detect...

3.
We study Bayesian dynamic models for detecting changepoints in count time series that present structural breaks. As the inferential approach, we develop a parameter-learning version of the algorithm proposed by Chopin [Chopin N. Dynamic detection of changepoints in long time series. Annals of the Institute of Statistical Mathematics 2007;59:349–366], called the Chopin filter with parameter learning, which allows us to estimate the static parameters in the model. In this extension, the static parameters are handled using the kernel smoothing approximations proposed by Liu and West [Liu J, West M. Combined parameter and state estimation in simulation-based filtering. In: Doucet A, de Freitas N, Gordon N, editors. Sequential Monte Carlo methods in practice. New York: Springer-Verlag; 2001]. The proposed methodology is then applied to both simulated and real data sets, with time series models whose distributions allow for overdispersion and/or zero inflation. Because the particle filter approach does not require restrictive specifications to ensure its validity and effectiveness, the procedure is general, robust and naturally adaptive, and we believe it is a valuable alternative for detecting changepoints in count time series. The proposed methodology is also suitable for count time series with no changepoints and for independent count data.
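As a rough illustration of the Liu-West kernel-smoothing device, the sketch below runs a plain (non-auxiliary) particle filter with parameter learning on a toy local-level Poisson count model; the model, discount factor and particle count are assumptions for the example, not the paper's specification.

```python
# Liu-West parameter learning inside a simple particle filter, on
# y_t ~ Poisson(exp(x_t)), x_t = x_{t-1} + sigma * eps_t, sigma static.
import numpy as np

rng = np.random.default_rng(0)

# Simulate counts with a level shift (a "changepoint" in the rate).
T = 200
x_true = np.concatenate([np.full(100, np.log(5.0)), np.full(100, np.log(15.0))])
y = rng.poisson(np.exp(x_true))

N = 5000                     # number of particles
delta = 0.98                 # Liu-West discount factor
a = (3 * delta - 1) / (2 * delta)
h2 = 1 - a ** 2

x = rng.normal(np.log(y.mean() + 1), 0.5, N)   # state particles
theta = rng.normal(np.log(0.1), 0.5, N)        # particles for log(sigma)

for t in range(T):
    # Kernel-smoothing move for the static parameter (Liu & West 2001):
    # shrink towards the mean, then add matched kernel noise.
    m = a * theta + (1 - a) * theta.mean()
    theta = rng.normal(m, np.sqrt(h2 * theta.var()))
    # Propagate the state and reweight by the Poisson likelihood.
    x = x + np.exp(theta) * rng.normal(size=N)
    logw = y[t] * x - np.exp(x)        # Poisson log-likelihood up to a constant
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(N, N, p=w)        # multinomial resampling
    x, theta = x[idx], theta[idx]

print("posterior mean rate at T:", np.exp(x).mean())
print("posterior mean sigma:", np.exp(theta).mean())
```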

4.
5.
The choice of the model framework in a regression setting depends on the nature of the data. The focus of this study is on changepoint data exhibiting three phases: incoming and outgoing phases, both of which are linear, joined by a curved transition. Bent-cable regression is an appealing statistical tool to characterize such trajectories, quantifying the nature of the transition between the two linear phases by modeling it as a quadratic phase of unknown width. We demonstrate that a quadratic function may not be adequate to describe many changepoint data sets. We then propose a generalization of the bent-cable model by relaxing the assumption of a quadratic bend. The properties of the generalized model are discussed and a Bayesian approach for inference is proposed. The generalized model is demonstrated with applications to three data sets taken from environmental science and economics. We also compare the quadratic bent-cable, generalized bent-cable and piecewise linear models in terms of goodness of fit in analyzing both real-world and simulated data. This study suggests that the proposed generalization of the bent-cable model can be valuable in adequately describing changepoint data that exhibit either an abrupt or a gradual transition over time.
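The sketch below fits the classical quadratic bent-cable mean function by nonlinear least squares, to make the three-phase structure concrete; the simulated data, starting values and bounds are invented for the example, and the paper's generalization replaces the quadratic bend with a more flexible form.

```python
# Quadratic bent-cable: two linear phases joined by a quadratic bend of
# half-width gamma centred at tau (Chiu, Lockhart and Routledge's form).
import numpy as np
from scipy.optimize import curve_fit

def bent_cable(t, b0, b1, b2, tau, gamma):
    q = np.where(
        np.abs(t - tau) <= gamma,
        (t - tau + gamma) ** 2 / (4 * gamma),    # quadratic transition
        np.where(t > tau + gamma, t - tau, 0.0)  # outgoing linear phase
    )
    return b0 + b1 * t + b2 * q

rng = np.random.default_rng(2)
t = np.linspace(0, 10, 200)
y = bent_cable(t, 1.0, 0.5, -1.2, 5.0, 1.5) + rng.normal(0, 0.1, t.size)

p0 = [0.0, 0.0, 0.0, 4.0, 1.0]   # crude starting values
popt, _ = curve_fit(bent_cable, t, y, p0=p0,
                    bounds=([-10, -10, -10, 0.0, 0.01],
                            [10, 10, 10, 10.0, 5.0]))
print(dict(zip(["b0", "b1", "b2", "tau", "gamma"], popt.round(2))))
```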

6.
A general way of detecting multivariate outliers involves using robust depth functions, or, equivalently, the corresponding ‘outlyingness’ functions; the more outlying an observation, the more extreme (less deep) it is in the data cloud and thus potentially an outlier. Most outlier detection studies in the literature assume that the underlying distribution is multivariate normal. This paper deals with the case of multivariate skewed data, specifically when the data follow the multivariate skew-normal [1] distribution. We compare the outlier detection capabilities of four robust outlier detection methods through their outlyingness functions in a simulation study. Two scenarios are considered for the occurrence of outliers: ‘the cluster’ and ‘the radial’. Conclusions and recommendations are offered for each scenario.
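As a concrete example of an outlyingness function, the sketch below computes squared robust Mahalanobis distances from the minimum covariance determinant (MCD) estimator; MCD and the chi-squared cutoff are standard normal-theory choices, not necessarily among the four methods the paper compares, and the normal-theory cutoff is exactly what skewed data call into question.

```python
# Robust Mahalanobis-type outlyingness via the MCD estimator.
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(3)
X = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=200)
X[:5] += 6                                # plant a small cluster of outliers

mcd = MinCovDet(random_state=0).fit(X)
d2 = mcd.mahalanobis(X)                   # squared robust distances
cutoff = chi2.ppf(0.975, df=X.shape[1])   # normal-theory threshold
print(np.where(d2 > cutoff)[0])           # flags the planted points
```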

7.
Generalized Gibbs samplers simulate from any direction, not necessarily limited to the coordinate directions of the parameters of the objective function. We study how to optimally choose such directions in a random scan Gibbs sampler setting. We take the optimal directions to be those that minimize the Kullback–Leibler divergence between two Markov chain Monte Carlo steps. Two distributions over directions are proposed for the multivariate normal objective function. The resulting algorithms are used to simulate from a truncated multivariate normal distribution, and their performance is compared with that of two algorithms based on the Gibbs sampler.
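For a multivariate normal objective function, the full conditional along any direction is univariate normal in closed form, which makes a random-direction Gibbs step easy to sketch; the uniform direction distribution below is an assumption, whereas the paper optimizes this choice.

```python
# Random-direction ("generalized") Gibbs step for a N(mu, Sigma) target.
import numpy as np

rng = np.random.default_rng(4)
mu = np.zeros(3)
Sigma = np.array([[1.0, 0.8, 0.3],
                  [0.8, 1.0, 0.5],
                  [0.3, 0.5, 1.0]])
Lam = np.linalg.inv(Sigma)          # precision matrix

def direction_gibbs_step(x):
    e = rng.normal(size=3)
    e /= np.linalg.norm(e)          # uniform direction on the sphere
    # Along x + t*e the conditional is t | x ~ N(m, s2):
    s2 = 1.0 / (e @ Lam @ e)
    m = -s2 * (e @ Lam @ (x - mu))
    return x + (m + np.sqrt(s2) * rng.normal()) * e

x = np.zeros(3)
draws = np.empty((20000, 3))
for i in range(draws.shape[0]):
    x = direction_gibbs_step(x)
    draws[i] = x
print(np.cov(draws[5000:].T).round(2))   # close to Sigma
```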

8.
Time series in fields such as finance and economics are often modelled using long memory processes. Alternative studies on the same data can suggest that the series may instead contain a ‘changepoint’ (a point within the time series where the data generating process has changed). The two model classes have been shown to share features, for example in their spectra, so without prior knowledge it is difficult to assess which model is more appropriate. We demonstrate that considering this problem in a time-varying environment, using the time-varying spectrum, removes this ambiguity. Using the wavelet spectrum, we then apply a classification approach to determine the more appropriate model (long memory or changepoint). Simulation results are presented across a number of models, followed by an application to stock cross-correlations and US inflation. The results indicate that the proposed classification outperforms an existing hypothesis testing approach on a number of models and performs comparably across the others.
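The sketch below computes per-level wavelet-coefficient variances (a simple wavelet spectrum estimate) in two time windows, the kind of time-varying feature such a classification can act on; the wavelet, decomposition depth, and toy variance-shift series are illustrative assumptions, not the paper's setup.

```python
# Per-level wavelet variances in two windows of a series.
import numpy as np
import pywt

rng = np.random.default_rng(11)
# A toy "changepoint" series: white noise whose variance jumps mid-way.
y = np.concatenate([rng.normal(0, 1, 512), rng.normal(0, 3, 512)])

def wavelet_spectrum(x, wavelet="db4", level=6):
    details = pywt.wavedec(x, wavelet, level=level)[1:]   # detail coefficients
    return np.array([np.mean(d ** 2) for d in details])   # variance by level

# A stationary long memory series would give similar curves in both
# windows; a changepoint gives clearly different ones.
print(wavelet_spectrum(y[:512]).round(2))
print(wavelet_spectrum(y[512:]).round(2))
```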

9.
Kernel smoothing of spatial point data can often be improved using an adaptive, spatially varying bandwidth instead of a fixed bandwidth. However, computation with a varying bandwidth is much more demanding, especially when edge correction and bandwidth selection are involved. This paper proposes several new computational methods for adaptive kernel estimation from spatial point pattern data. A key idea is that a variable-bandwidth kernel estimator for d-dimensional spatial data can be represented as a slice of a fixed-bandwidth kernel estimator in (d+1)-dimensional scale space, enabling fast computation using Fourier transforms. Edge correction factors have a similar representation. Different values of global bandwidth correspond to different slices of the scale space, so that bandwidth selection is greatly accelerated. Potential applications include estimation of multivariate probability density and spatial or spatiotemporal point process intensity, relative risk, and regression functions. The new methods perform well in simulations and in two real applications concerning the spatial epidemiology of primary biliary cirrhosis and the alarm calls of capuchin monkeys.
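To fix ideas, the sketch below is the naive O(n²) version of an adaptive (Abramson-type) kernel estimator with square-root-of-pilot bandwidths; the paper's contribution is to accelerate exactly this kind of computation with Fourier transforms by slicing a (d+1)-dimensional scale space, which this sketch does not attempt. The global bandwidth is an arbitrary choice.

```python
# Naive adaptive (variable-bandwidth) kernel density estimate in 2-D.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(5)
pts = np.concatenate([rng.normal(0, 0.3, (150, 2)),
                      rng.normal(3, 1.0, (150, 2))])

pilot = gaussian_kde(pts.T)(pts.T)        # fixed-bandwidth pilot density
h0 = 0.5                                  # global bandwidth (assumed)
g = np.exp(np.mean(np.log(pilot)))        # geometric-mean normalizer
h = h0 * np.sqrt(g / pilot)               # Abramson per-point bandwidths

def adaptive_kde(x):
    """Evaluate the variable-bandwidth estimate at a single 2-D point x."""
    d2 = np.sum((pts - x) ** 2, axis=1)
    k = np.exp(-0.5 * d2 / h ** 2) / (2 * np.pi * h ** 2)
    return k.mean()

print(adaptive_kde(np.array([0.0, 0.0])))   # high density in tight cluster
print(adaptive_kde(np.array([3.0, 3.0])))   # lower density in diffuse cluster
```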

10.
Approximate Bayesian computation (ABC) using a sequential Monte Carlo method provides a comprehensive platform for parameter estimation, model selection and sensitivity analysis in differential equations. However, this method, like other Monte Carlo methods, incurs a significant computational cost as it requires explicit numerical integration of differential equations to carry out inference. In this paper we propose a novel method for circumventing the requirement of explicit integration by using derivatives of Gaussian processes to smooth the observations from which parameters are estimated. We evaluate our methods using synthetic data generated from model biological systems described by ordinary and delay differential equations. Upon comparing the performance of our method to existing ABC techniques, we demonstrate that it produces comparably reliable parameter estimates at a significantly reduced execution time.
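The core idea, gradient matching via a smoother, can be sketched compactly: fit a Gaussian process to the observations, differentiate the fitted mean instead of integrating the ODE, and score parameters by their mismatch with the ODE right-hand side. The logistic-growth model, kernel, numerical derivative, and grid search below are assumptions for illustration; the paper uses analytic GP derivatives inside a sequential Monte Carlo ABC scheme.

```python
# Gradient matching with a GP smoother for dx/dt = r*x*(1 - x/K).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(6)
t = np.linspace(0, 10, 40)
r_true, K_true = 0.8, 10.0
x_true = K_true / (1 + 9 * np.exp(-r_true * t))     # logistic solution, x0 = 1
y = x_true + rng.normal(0, 0.2, t.size)

gp = GaussianProcessRegressor(RBF(2.0) + WhiteKernel(0.04),
                              normalize_y=True).fit(t[:, None], y)
tg = np.linspace(0, 10, 400)
mean = gp.predict(tg[:, None])
dmean = np.gradient(mean, tg)    # numerical derivative of the GP mean
                                 # (the paper differentiates the GP analytically)

def discrepancy(r, K):
    # Mismatch between smoothed derivative and ODE right-hand side.
    return np.mean((dmean - r * mean * (1 - mean / K)) ** 2)

# Crude grid search standing in for the ABC acceptance step.
grid = [(r, K) for r in np.linspace(0.2, 1.5, 27)
               for K in np.linspace(5, 15, 21)]
best = min(grid, key=lambda p: discrepancy(*p))
print("accepted (r, K) near:", best)    # close to (0.8, 10.0)
```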

11.
The stalactite plot for the detection of multivariate outliers
Detection of multiple outliers in multivariate data using Mahalanobis distances requires robust estimates of the means and covariance of the data. We obtain this by sequential construction of an outlier free subset of the data, starting from a small random subset. The stalactite plot provides a cogent summary of suspected outliers as the subset size increases. The dependence on subset size can be virtually removed by a simulation-based normalization. Combined with probability plots and resampling procedures, the stalactite plot, particularly in its normalized form, leads to identification of multivariate outliers, even in the presence of appreciable masking.
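A minimal sketch of the underlying forward search follows: grow an outlier-free subset from a small random start, refit the mean and covariance, and record which observations exceed a distance threshold at each subset size (each pass yielding one row of the stalactite plot). The threshold and the planted outliers are illustrative assumptions, and the simulation-based normalization is omitted.

```python
# Forward search behind the stalactite plot.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(7)
X = rng.multivariate_normal(np.zeros(3), np.eye(3), size=100)
X[:4] += 5                                     # four planted outliers

n, p = X.shape
subset = rng.choice(n, p + 1, replace=False)   # small random starting subset
flags = []
for m in range(p + 1, n):
    mu = X[subset].mean(axis=0)
    S = np.cov(X[subset].T)
    diff = X - mu
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(S), diff)
    flags.append(d2 > chi2.ppf(0.99, p))       # one row of the stalactite plot
    subset = np.argsort(d2)[: m + 1]           # grow the subset by one point
flags = np.array(flags)
# Observations flagged for most subset sizes are the persistent suspects.
print(np.where(flags.mean(axis=0) > 0.5)[0])   # rows 0-3
```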

12.
The effect of partial dependence in a binary sequence on tests for the presence of a changepoint or changed segment is investigated and exemplified in the context of modelling non-coding deoxyribonucleic acid (DNA). For the levels of dependence that are commonly seen in such DNA, the null distributions of the test statistics are approximately correct, so conclusions based on them remain valid. Strong dependence would, however, invalidate the use of such procedures.
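For concreteness, the sketch below computes a generic CUSUM-type changepoint statistic for a binary sequence, the kind of statistic whose null calibration the dependence question concerns; the statistic, threshold-free summary and simulated sequence are generic illustrations, not the paper's exact procedure.

```python
# A simple CUSUM-type changepoint statistic for a binary sequence.
import numpy as np

rng = np.random.default_rng(12)
x = np.concatenate([rng.binomial(1, 0.3, 300), rng.binomial(1, 0.6, 300)])

n = x.size
S = np.cumsum(x)
k = np.arange(1, n)
# |S_k - (k/n) S_n| / sqrt(n): large values suggest a change in rate.
cusum = np.abs(S[:-1] - k * S[-1] / n) / np.sqrt(n)
print("max CUSUM:", cusum.max().round(3),
      "at position:", int(k[cusum.argmax()]))   # near the true change at 300
```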

13.
14.
A new method to calculate the multivariate t-distribution is introduced. We provide a series of substitutions which transform the starting q-variate integral into one over the (q-1)-dimensional hypercube. In this setting standard numerical integration methods can be applied. Three algorithms are discussed in detail. As an application we derive an expression to calculate the power of multiple contrast tests assuming normally distributed data.
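As a baseline for comparison, the multivariate t probability can always be estimated by plain Monte Carlo through the representation T = Z / sqrt(W/ν); the sketch below does this, whereas the paper's substitutions map the same integral onto the unit hypercube so that deterministic integration rules apply. The dimension, ν, Σ and b are arbitrary example values.

```python
# Plain Monte Carlo estimate of P(T <= b) for a multivariate t.
import numpy as np

rng = np.random.default_rng(8)
nu = 5
Sigma = np.array([[1.0, 0.5, 0.5],
                  [0.5, 1.0, 0.5],
                  [0.5, 0.5, 1.0]])
b = np.array([1.0, 1.5, 2.0])

n = 200_000
Z = rng.multivariate_normal(np.zeros(3), Sigma, size=n)
W = rng.chisquare(nu, size=n)
T = Z / np.sqrt(W / nu)[:, None]          # t variates via Z / sqrt(W/nu)
est = np.mean(np.all(T <= b, axis=1))
se = np.sqrt(est * (1 - est) / n)         # binomial standard error
print(f"P(T <= b) ~ {est:.4f} (MC s.e. {se:.4f})")
```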

15.
This work presents a closed formula to compute any multivariate factorized expected value from knowledge of the joint cumulative distribution function (cdf) of any random vector. Additionally, a new nonparametric estimator, an alternative to the sample average, is presented for the univariate case.

16.
…d-dimensional random vector X is some nondegenerate d-variate normal distribution, on the basis of i.i.d. copies X_1, …, X_n of X. Particular emphasis is given to progress that has been achieved during the last decade. Furthermore, we stress the typical diagnostic pitfall connected with purportedly ‘directed’ procedures, such as tests based on measures of multivariate skewness.
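As an example of the ‘directed’ procedures mentioned above, the sketch below computes Mardia's multivariate skewness statistic with its standard asymptotic chi-squared calibration; the simulated data are an assumption for illustration.

```python
# Mardia's multivariate skewness test for normality.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(9)
X = rng.multivariate_normal(np.zeros(3), np.eye(3), size=300)

n, d = X.shape
C = X - X.mean(axis=0)
Sinv = np.linalg.inv(np.cov(X.T, bias=True))   # ML covariance estimate
M = C @ Sinv @ C.T                   # m_ij = (x_i - xbar)' S^{-1} (x_j - xbar)
b1 = (M ** 3).sum() / n ** 2         # Mardia's skewness b_{1,d}
stat = n * b1 / 6                    # asymptotically chi-squared
df = d * (d + 1) * (d + 2) // 6
print("p-value:", chi2.sf(stat, df))
```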

17.
18.
This work presents advanced computational aspects of a new method for changepoint detection on spatio-temporal point process data. We summarize the methodology, based on building a Bayesian hierarchical model for the data and placing prior conjectures on the number and positions of the changepoints, and show how to take decisions regarding the acceptance of potential changepoints. The focus of this work is on choosing an approach that detects the correct changepoint and delivers smooth, reliable estimates in a feasible computational time; we propose Bayesian P-splines as a suitable tool for managing spatial variation, from both a computational and a model-fitting performance perspective. The main computational challenges are outlined, and a solution involving parallel computing in R is proposed and tested in a simulation study. An application is also presented on a data set of seismic events in Italy over the last 20 years.
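To fix ideas about the P-spline ingredient, here is a univariate Python sketch of a penalized B-spline fit (B-spline basis plus a second-order difference penalty); the knot grid and smoothing parameter are arbitrary choices, and the paper itself works with spatial fits and parallel R code rather than this one-dimensional toy.

```python
# One-dimensional P-spline: B-spline basis + difference penalty.
# Requires SciPy >= 1.8 for BSpline.design_matrix.
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(10)
x = np.sort(rng.uniform(0, 1, 150))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, x.size)

k = 3                                          # cubic B-splines
inner = np.linspace(0, 1, 20)
knots = np.concatenate([[0] * k, inner, [1] * k])
B = BSpline.design_matrix(x, knots, k).toarray()
D = np.diff(np.eye(B.shape[1]), n=2, axis=0)   # second-order differences
lam = 1.0                                      # smoothing parameter (assumed)
# Penalized least squares: (B'B + lam D'D) beta = B'y.
beta = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
fit = B @ beta
print("residual s.d.:", (y - fit).std().round(3))
```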

19.
Various aspects of assessing multivariate normality are discussed. Practical recommendations are given, and areas of further research interest are noted.

20.
The product partition model (PPM) is a well-established efficient statistical method for detecting multiple change points in time-evolving univariate data. In this article, we refine the PPM for the purpose of detecting multiple change points in correlated multivariate time-evolving data. Our model detects distributional changes in both the mean and covariance structures of multivariate Gaussian data by exploiting a smaller dimensional representation of correlated multiple time series. The utility of the proposed method is demonstrated through experiments on simulated and real datasets.
