Similar Documents
20 similar documents retrieved.
1.
The zero-inflated Poisson (ZIP) regression model is a popular approach to the analysis of count data with excess zeros. For correlated count data, where the observations are repeated or clustered outcomes from individual subjects, a ZIP mixed regression model may be appropriate. However, the ZIP model may fail to fit such data because of over-dispersion or under-dispersion relative to the Poisson distribution. In this paper, we extend the ZIP mixed regression model to a zero-inflated generalized Poisson (ZIGP) mixed regression model, in which the baseline discrete distribution is the generalized Poisson (GP) distribution, a natural extension of the standard Poisson distribution. Random effects are included in both the zero-inflation and GP components throughout the paper. An EM algorithm for estimating the parameters is proposed based on the best linear unbiased prediction-type (BLUP) log-likelihood and residual maximum likelihood (REML). Several score tests are presented for testing the ZIP mixed regression model against the ZIGP mixed regression model and for testing the significance of the regression coefficients in the zero-inflation and generalized Poisson portions. A numerical example illustrates the methodology, and the properties of the score test statistics are investigated through Monte Carlo simulations.
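As context for the ZIGP formulation, the sketch below (Python; names and parameterization are assumptions, not taken from the paper) evaluates the generalized Poisson and zero-inflated generalized Poisson log-probability mass functions. The mixed-model random effects and the EM/REML estimation described above are not implemented.

```python
import numpy as np
from scipy.special import gammaln

def gp_logpmf(y, theta, lam):
    """Log-pmf of the generalized Poisson (GP) distribution (Consul's parameterization).
    Mean = theta / (1 - lam); lam = 0 recovers the standard Poisson."""
    y = np.asarray(y, dtype=float)
    return (np.log(theta) + (y - 1.0) * np.log(theta + lam * y)
            - (theta + lam * y) - gammaln(y + 1.0))

def zigp_logpmf(y, phi, theta, lam):
    """Log-pmf of the zero-inflated GP: a point mass at zero with probability phi,
    mixed with a GP(theta, lam) distribution with probability 1 - phi."""
    y = np.asarray(y, dtype=float)
    log_p0 = np.log(phi + (1.0 - phi) * np.exp(-theta))   # zeros from either component
    log_py = np.log1p(-phi) + gp_logpmf(y, theta, lam)    # positive counts: GP component only
    return np.where(y == 0, log_p0, log_py)

# lam > 0 gives over-dispersion and lam < 0 under-dispersion relative to the Poisson
print(zigp_logpmf([0, 1, 5], phi=0.2, theta=2.0, lam=0.15))
```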

2.
The proportion of explained variation (R2) is frequently used in the general linear model, but no standard definition of R2 exists for logistic regression. We present a SAS macro that calculates two R2 measures for logistic regression, one based on Pearson residuals and one based on deviance residuals. Adjusted versions of both measures are also given, which should prevent the inflation of R2 in small samples.
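The exact formulas implemented by the macro are not reproduced here; the minimal Python sketch below computes two commonly used analogues, one based on Pearson residuals and one based on deviance, each comparing the fitted model with the intercept-only model. Adjusted versions of the kind mentioned above would additionally penalize for the number of estimated parameters.

```python
import numpy as np
import statsmodels.api as sm

def logistic_r2(y, X):
    """Pearson- and deviance-residual-based R2 analogues for logistic regression,
    each comparing the fitted model to the intercept-only (null) model."""
    Xc = sm.add_constant(X)
    fit = sm.GLM(y, Xc, family=sm.families.Binomial()).fit()
    p_hat = fit.fittedvalues
    p_bar = np.full_like(p_hat, y.mean())

    # Pearson-based: 1 - SS(Pearson residuals, model) / SS(Pearson residuals, null)
    ss_model = np.sum((y - p_hat) ** 2 / (p_hat * (1 - p_hat)))
    ss_null = np.sum((y - p_bar) ** 2 / (p_bar * (1 - p_bar)))
    r2_pearson = 1 - ss_model / ss_null

    # Deviance-based: 1 - deviance(model) / deviance(null)
    r2_deviance = 1 - fit.deviance / fit.null_deviance
    return r2_pearson, r2_deviance

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + x))))
print(logistic_r2(y, x[:, None]))
```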

3.
In clinical and epidemiologic research on dose-response associations, non-parametric spline regression has long been proposed as a powerful alternative to conventional parametric regression approaches, since no underlying assumption of linearity has to be fulfilled. For logistic spline models, however, little standard statistical software is available to date for estimating measures of risk, which are typically of interest when quantifying the effects of one or more continuous explanatory variables on a binary disease outcome. In the present paper, we propose a set of SAS macros that perform non-parametric logistic regression analysis with B-spline expansions of an arbitrary number of continuous covariates, estimating adjusted odds ratios with confidence intervals for any given value relative to a supplied reference value. Our SAS code also allows the shape of the association to be visualized graphically, retaining the exposure variable under consideration in its initial, continuous form while concurrently adjusting for multiple confounding factors. The macros are easy to use and can be implemented quickly by clinical or epidemiological researchers to flexibly investigate any dose-response association between continuous exposures and the risk of a binary disease outcome. We illustrate the application of our SAS code by investigating the effect of body-mass index on cancer incidence in a large, population-based male cohort.
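The SAS macros themselves are not shown; the Python sketch below illustrates the core computation under assumed choices (patsy B-spline basis with df=4, delta-method 95% intervals, no additional confounders): fit a logistic model on a B-spline expansion of the exposure, then convert basis contrasts against a reference value into odds ratios with confidence limits.

```python
import numpy as np
import statsmodels.api as sm
from patsy import dmatrix, build_design_matrices

def spline_logistic_or(x, y, x_grid, x_ref, df=4):
    """Fit a logistic model with a B-spline expansion of x and return odds ratios
    (with 95% CIs) at x_grid relative to x_ref, via the delta method.
    x_grid values should lie within the range of the observed x."""
    design = dmatrix(f"bs(x, df={df})", {"x": x}, return_type="dataframe")
    fit = sm.GLM(y, design, family=sm.families.Binomial()).fit()

    # Evaluate the same spline basis at the grid and at the reference value
    info = design.design_info
    B_grid = np.asarray(build_design_matrices([info], {"x": x_grid})[0])
    B_ref = np.asarray(build_design_matrices([info], {"x": np.atleast_1d(x_ref)})[0])
    D = B_grid - B_ref                                   # contrast f(x) - f(x_ref) on the logit scale

    log_or = D @ fit.params.values
    se = np.sqrt(np.einsum("ij,jk,ik->i", D, fit.cov_params().values, D))
    return np.exp(log_or), np.exp(log_or - 1.96 * se), np.exp(log_or + 1.96 * se)
```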

4.
For generalized linear regression models that use natural cubic splines to model predictors, we provide an S-Plus function to calculate relative risks (RR), log relative risks (logRR), and mean percent changes (MPC) for continuous covariates modeled with a logarithmic link, as well as adjusted mean differences (MD) for the identity link. The function makes explicit use of the natural spline basis functions, the estimated coefficient of each basis function, and the fitted correlation matrix of the estimated coefficients, and can thus accommodate any number of degrees of freedom. The main function produces a publication-quality graph of these quantities, compared to a user-specified reference value, together with the associated confidence limits. A second function calculates, rather than plots, specific values of these statistics comparing a vector of values of the independent variable to the reference value.
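A minimal Python sketch of the quantity such a function returns, assuming the spline basis rows, coefficient vector, and covariance matrix are supplied by the user (all names hypothetical): for a logarithmic link, logRR comparing x with the reference is the basis contrast times the coefficients, with a delta-method standard error.

```python
import numpy as np

def spline_rr(basis_x, basis_ref, beta, cov, z=1.96):
    """Relative risk (log link) comparing covariate value x to a reference value,
    given the natural-spline basis rows N(x) and N(x_ref), the estimated
    coefficients beta for those basis columns, and their covariance matrix."""
    d = np.asarray(basis_x) - np.asarray(basis_ref)    # contrast vector
    log_rr = d @ beta                                  # logRR = (N(x) - N(x_ref))' beta
    se = np.sqrt(d @ cov @ d)                          # delta-method standard error
    return np.exp(log_rr), np.exp(log_rr - z * se), np.exp(log_rr + z * se)
```

For the identity link, the same contrast without exponentiation gives the adjusted mean difference.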

5.
In this paper, we propose a new approach, a fuzzy class model for Poisson regression, for the analysis of heterogeneous count data. On the basis of the fuzzy set concept and fuzzy classification maximum likelihood (FCML) procedures, we create an FCML algorithm for fuzzy class Poisson regression models. Traditionally, the EM algorithm has been used for latent class regression models; therefore, the accuracy and effectiveness of the EM and FCML algorithms for estimating the parameters are compared. The results show that the proposed FCML algorithm offers better accuracy and effectiveness and can serve as another good tool for regression analysis of heterogeneous count data. This work was supported in part by the National Science Council of Taiwan under Grant NSC-89-2213-E-033-007.

6.
In this article, we consider diagnostics for skew-normal nonlinear regression models with AR(1) errors, which provide a useful extension of normal regression models. Estimation of the model parameters is studied via the EM algorithm. Several score tests are presented for testing homogeneity of the scale parameter and/or the significance of the autocorrelation in skew-normal nonlinear regression models. The properties of the score tests are investigated through Monte Carlo simulations, and the test methods are illustrated with two numerical examples.
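For orientation only, the sketch below fits a skew-normal nonlinear regression by direct likelihood maximization in Python, assuming independent errors and a hypothetical exponential mean function; the AR(1) error structure, EM algorithm, and score tests of the paper are not implemented.

```python
import numpy as np
from scipy.stats import skewnorm
from scipy.optimize import minimize

def neg_loglik(params, x, y, mean_fn):
    """Negative log-likelihood of y_i = mean_fn(x_i, beta) + e_i with skew-normal
    errors (shape alpha, scale sigma); errors are treated as independent here,
    without the AR(1) structure considered in the paper."""
    *beta, log_sigma, alpha = params
    resid = y - mean_fn(x, np.asarray(beta))
    return -np.sum(skewnorm.logpdf(resid, a=alpha, loc=0.0, scale=np.exp(log_sigma)))

# Hypothetical exponential-growth mean function and simulated data
mean_fn = lambda x, b: b[0] * np.exp(b[1] * x)
x = np.linspace(0, 2, 100)
y = 2.0 * np.exp(0.8 * x) + skewnorm.rvs(a=3, scale=0.3, size=100, random_state=1)
fit = minimize(neg_loglik, x0=[1.0, 0.5, 0.0, 0.0], args=(x, y, mean_fn),
               method="Nelder-Mead")
print(fit.x)   # estimates of beta0, beta1, log(sigma), alpha
```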

7.
8.
Variable selection for Poisson regression when the response variable is potentially underreported is considered. A logistic regression model is used for the latent underreporting probabilities. An efficient MCMC sampling scheme is designed that incorporates uncertainty about which explanatory variables affect the dependent variable and which affect the underreporting probabilities. Validation data are required in order to identify and estimate all parameters. A simulation study shows favorable results in terms of both variable selection and parameter estimation. Finally, the procedure is applied to a real data example concerning deaths from cervical cancer.
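The Bayesian variable-selection MCMC is not reproduced here; the sketch below (Python, hypothetical design matrices) shows the likelihood that underlies such models: if true counts are Poisson and each event is reported independently with probability p, the observed counts are Poisson with mean lambda*p by thinning, with a log link for lambda and a logit link for p. Without validation data only the product lambda*p is identified, which is exactly why validation data are needed.

```python
import numpy as np
from scipy.special import gammaln, expit

def neg_loglik(params, X_lambda, X_p, y):
    """Negative log-likelihood of an underreported Poisson model: true counts are
    Poisson with log(lambda_i) = X_lambda @ beta, each event is reported with
    probability p_i where logit(p_i) = X_p @ gamma, so the observed counts are
    Poisson(lambda_i * p_i) by thinning. Note: beta and gamma are not separately
    identifiable from these data alone (validation data are needed)."""
    k = X_lambda.shape[1]
    beta, gamma = params[:k], params[k:]
    lam = np.exp(X_lambda @ beta)
    p = expit(X_p @ gamma)
    mu = lam * p
    return -np.sum(y * np.log(mu) - mu - gammaln(y + 1))
```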

9.
This paper presents simple large-sample prediction intervals for a future response Y_f given a vector x_f of predictors when the regression model has the form Y_i = m(x_i) + e_i, where m is a function of x_i and the errors e_i are iid. Intervals with correct asymptotic coverage and shortest asymptotic length can be made by applying the shorth estimator to the residuals. Since residuals underestimate the errors, finite-sample correction factors are needed. As an application, three prediction intervals are given for the least squares multiple linear regression model. The asymptotic coverage and length of these intervals and of the classical interval are derived. The new intervals are useful because the distribution of the errors does not need to be known, and simulations suggest that the large-sample theory often provides good approximations for moderate sample sizes.
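A minimal numpy sketch of the idea for the least squares case: apply the shorth (the shortest interval covering a given fraction of values) to the OLS residuals and shift it to the point prediction. The correction factor used below is a simple placeholder, not the finite-sample factor derived in the paper.

```python
import numpy as np

def shorth_interval(resid, coverage=0.95):
    """Shortest interval containing ceil(n * coverage) of the residuals (the 'shorth')."""
    r = np.sort(np.asarray(resid))
    n = len(r)
    k = int(np.ceil(n * coverage))
    widths = r[k - 1:] - r[: n - k + 1]        # width of every window of k ordered residuals
    i = int(np.argmin(widths))
    return r[i], r[i + k - 1]

def prediction_interval(y, X, x_f, coverage=0.95):
    """Large-sample prediction interval for a future response at x_f based on OLS
    residuals and the shorth; the small-sample correction from the paper is replaced
    here by a simple placeholder inflation factor."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    lo, hi = shorth_interval(resid, coverage)
    n, p = Xc.shape
    a_n = np.sqrt((n + p) / (n - p))           # placeholder correction, NOT the paper's factor
    y_hat = np.concatenate([[1.0], np.atleast_1d(x_f)]) @ beta
    return y_hat + a_n * lo, y_hat + a_n * hi
```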

10.
In this paper, several diagnostic measures based on the case-deletion model are proposed for log-Birnbaum-Saunders regression models (LBSRM), as a supplement to the recent work of Galea et al. [2004. Influence diagnostics in log-Birnbaum-Saunders regression models. J. Appl. Statist. 31, 1049-1064], who studied influence diagnostics for the LBSRM mainly through local influence analysis. It is shown that the case-deletion model is equivalent to the mean-shift outlier model in the LBSRM, and an outlier test based on the mean-shift outlier model is presented. Furthermore, we investigate a test of homogeneity for the shape parameter in the LBSRM, a problem mentioned by both Rieck and Nedelman [1991. A log-linear model for the Birnbaum-Saunders distribution. Technometrics 33, 51-60] and Galea et al. [2004]. We obtain the likelihood ratio and score statistics for this test. Finally, a numerical example illustrates the methodology, and the properties of the likelihood ratio and score statistics are investigated through Monte Carlo simulations.

11.
Upper and lower regression models (dual possibilistic models) are proposed for data analysis with crisp inputs and interval or fuzzy outputs. Based on the given data, the dual possibilistic models can be derived from the upper and lower directions, respectively, and an inclusion relationship between the two models holds. The inherent uncertainty in the given phenomenon can thus be approximated by the dual models. As the core of possibilistic regression, possibilistic regression for crisp inputs and interval outputs is considered first: the basic dual linear models based on linear programming, dual nonlinear models based on linear programming, and dual nonlinear models based on quadratic programming are systematically addressed, and the similarities between dual possibilistic regression models and rough sets are analyzed in depth. Then, as a natural extension, dual possibilistic regression models for crisp inputs and fuzzy outputs are addressed.
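To make the linear-programming formulation concrete, the sketch below (Python/scipy; intercept handling and variable names are assumptions) fits the upper (outer) possibilistic linear model for crisp inputs and interval outputs: the predicted interval must contain every observed interval, and the total spread is minimized. The lower (inner) model is the dual counterpart in which the predicted intervals are required to lie inside the observed ones; it is not implemented here.

```python
import numpy as np
from scipy.optimize import linprog

def upper_possibilistic_fit(X, y_low, y_high):
    """Upper (outer) possibilistic linear regression for crisp inputs and interval
    outputs [y_low, y_high]: the predicted interval  c'x +/- s'|x|  must contain every
    observed interval, and the total spread is minimized (a linear program)."""
    y_low, y_high = np.asarray(y_low, float), np.asarray(y_high, float)
    X = np.column_stack([np.ones(len(y_low)), X])      # include an intercept column
    A = np.abs(X)
    n, k = X.shape
    c_obj = np.concatenate([np.zeros(k), A.sum(axis=0)])   # minimize total spread
    A_ub = np.vstack([np.hstack([X, -A]),                  #  c'x - s'|x| <= y_low
                      np.hstack([-X, -A])])                # -(c'x + s'|x|) <= -y_high
    b_ub = np.concatenate([y_low, -y_high])
    bounds = [(None, None)] * k + [(0, None)] * k          # centers free, spreads >= 0
    res = linprog(c_obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:k], res.x[k:]                            # centers c, spreads s
```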

12.
Input selection for nonlinear regression models
A simple and effective method for selecting significant inputs in nonlinear regression models is proposed. Given a set of input-output data and an initial superset of potential inputs, the relevant inputs are selected by checking whether, after deleting a particular input, the data set is still consistent with the basic property of a function. In order to handle real-valued and noisy data in a sensible manner, fuzzy clustering is first applied; the obtained clusters are then compared using a similarity measure in order to find inconsistencies within the data. Several examples using simulated and real-world data sets demonstrate the effectiveness of the algorithm.

13.
In survival analysis, we often estimate the hazard rate of a specific cause. Sometimes the main focus is not the hazard rates but the cumulative incidences, i.e., the probability of having failed from a specific cause prior to a given time. The cumulative incidences may be calculated from the hazard rates, and the hazard rates are often estimated by Cox regression. This procedure may not be suitable for large studies because of limited computer resources; instead, one may use Poisson regression, which approximates the Cox regression. Rosthøj et al. presented a SAS macro for the estimation of cumulative incidences based on Cox regression. I present the functional form of the probabilities and variances when using piecewise constant hazard rates, together with a SAS macro for the estimation using Poisson regression. The use of the macro is demonstrated through examples and compared to the macro presented by Rosthøj et al.
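As a sketch of the piecewise-constant-hazard computation (numpy; the rates in the example are hypothetical), the cumulative incidence of cause k accumulates, interval by interval, the probability of surviving to the start of the interval times the probability of failing from cause k within it.

```python
import numpy as np

def cumulative_incidence(breaks, hazards):
    """Cumulative incidence functions from piecewise constant cause-specific hazards.
    breaks: interval endpoints [t_0=0, t_1, ..., t_J]; hazards: array of shape (J, K)
    with cause-specific hazard rates, constant within each interval."""
    hazards = np.asarray(hazards, dtype=float)
    widths = np.diff(breaks)
    total = hazards.sum(axis=1)                            # all-cause hazard per interval
    surv_start = np.exp(-np.concatenate([[0.0], np.cumsum(total * widths)]))[:-1]
    # P(fail from cause k in interval j) =
    #   (lambda_kj / lambda_.j) * S(t_{j-1}) * (1 - exp(-lambda_.j * w_j))
    jump = surv_start * (1.0 - np.exp(-total * widths))
    increments = hazards * (jump / total)[:, None]
    return np.cumsum(increments, axis=0)                   # CIF_k evaluated at each t_j

# Hypothetical example: two causes, three intervals of length 1
cif = cumulative_incidence([0, 1, 2, 3], [[0.10, 0.05], [0.12, 0.05], [0.15, 0.04]])
print(cif)   # rows: t = 1, 2, 3; columns: cause 1, cause 2
```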

14.
In clinical studies, covariates are often measured with error due to biological fluctuations, device error, and other sources. Summary statistics and regression models based on mis-measured data will differ from the corresponding analysis based on the "true" covariate. Statistical analyses can be adjusted for measurement error; however, the various methods exhibit a trade-off between convenience and performance. Moment Adjusted Imputation (MAI) is a measurement-error correction method for a scalar latent variable that is easy to implement and performs well in a variety of settings. In practice, multiple covariates may be similarly influenced by biological fluctuations, inducing correlated, multivariate measurement error. The extension of MAI to the setting of multivariate latent variables involves unique challenges. Alternative strategies are described, including a computationally feasible option that is shown to perform well.

15.
This note is motivated by the recent work of Xie et al. (2009) and Xiang et al. (2007). Herein, we simplify the score statistic presented by Xie et al. (2009) for testing overdispersion in the zero-inflated generalized Poisson (ZIGP) mixed model, and we discuss an extension for testing overdispersion in zero-inflated Poisson (ZIP) mixed models. Examples highlight the application of the extended results. An extensive simulation study for testing overdispersion in the Poisson mixed model indicates that the proposed score statistics maintain the nominal level reasonably well. In practice, the appropriate model is chosen based on the approximate mean-variance relationship in the data, and a formal score test based on the asymptotic standard normal distribution can be employed to test for overdispersion. A case study illustrates the procedures for data analysis.

16.
This paper focuses on the Bayesian posterior mean estimates (Bayes estimates) of the parameter set of Poisson hidden Markov models, in which the observation sequence is generated by a Poisson distribution whose parameter depends on the underlying discrete-time, time-homogeneous Markov chain. Although the most commonly used procedures for obtaining parameter estimates for hidden Markov models are versions of the expectation-maximization and Markov chain Monte Carlo approaches, this paper presents an algorithm for calculating the exact posterior mean estimates which, although still cumbersome, has polynomial rather than exponential complexity and is a feasible alternative for small-scale models and data sets. Simulation results are also shown, comparing the posterior mean estimates obtained by this algorithm with the maximum likelihood estimates obtained by the expectation-maximization approach.
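The exact posterior-mean algorithm of the paper is not reproduced; for reference, the sketch below implements the scaled forward recursion that evaluates the likelihood of a Poisson hidden Markov model, a basic building block shared by EM and Bayesian treatments (the parameter values in the example are hypothetical).

```python
import numpy as np
from scipy.stats import poisson

def poisson_hmm_loglik(y, pi0, P, rates):
    """Log-likelihood of a Poisson hidden Markov model via the scaled forward
    recursion: pi0 is the initial state distribution, P the transition matrix,
    and rates the state-specific Poisson means."""
    y = np.asarray(y)
    B = poisson.pmf(y[:, None], np.asarray(rates)[None, :])   # emission probs, shape (T, K)
    alpha = pi0 * B[0]
    loglik = 0.0
    for t in range(len(y)):
        if t > 0:
            alpha = (alpha @ P) * B[t]
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = alpha / scale                                  # rescale to avoid underflow
    return loglik

# Hypothetical two-state example
y = [1, 0, 4, 6, 5, 0, 1]
print(poisson_hmm_loglik(y, pi0=np.array([0.5, 0.5]),
                         P=np.array([[0.9, 0.1], [0.2, 0.8]]),
                         rates=np.array([0.8, 5.0])))
```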

17.
Based on the soil line concept, various vegetation indices have been proposed to minimize soil background influences in the inventory of forest resources and the prediction of vegetation biomass. Unfortunately, those indices can only reduce the effect of soil moisture on remote sensing data parallel to the soil line axis, and they fail when different soil types appear (in the direction perpendicular to the soil line). A two-axis adjusted vegetation index is presented here to diminish most soil background influences. It is shown to be more suitable as a global vegetation-monitoring index than other indices.
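The proposed two-axis index itself is not reproduced here. For context, the classical single-axis soil-adjusted vegetation index (SAVI, Huete 1988) shown below corrects only along the soil-line direction with one soil-brightness factor L, which is exactly the limitation the two-axis index is designed to overcome (the reflectance values in the example are hypothetical).

```python
import numpy as np

def savi(nir, red, L=0.5):
    """Classical soil-adjusted vegetation index: a single soil-brightness correction
    factor L shifts the index along the soil-line direction only."""
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return (nir - red) * (1.0 + L) / (nir + red + L)

print(savi(0.45, 0.12))   # a typical vegetated-pixel reflectance pair
```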

18.
19.
Introduction: Several statistical methods for assessing seasonal variation are available. Brookhart and Rothman [3] proposed a second-order moment-based estimator based on the geometrical model derived by Edwards [1], and reported that this estimator is superior to Edwards' estimator in estimating the peak-to-trough ratio of seasonal variation with respect to bias and mean squared error. Alternatively, seasonal variation may be modelled using a Poisson regression model, which provides flexibility in modelling the pattern of seasonal variation and in adjusting for covariates. Method: In a Monte Carlo simulation study, three estimators, one based on the geometrical model and two based on log-linear Poisson regression models, were evaluated with regard to bias and standard deviation (SD). The estimators were evaluated on data simulated according to schemes varying in seasonal variation and in the presence of a secular trend. All methods and analyses in this paper are available in the R package Peak2Trough [13]. Results: Applying a Poisson regression model resulted in lower absolute bias and SD for data simulated according to the corresponding model assumptions. The Poisson regression models also had lower bias and SD than the geometrical model for data simulated to deviate from the corresponding model assumptions. Conclusion: This simulation study encourages the use of Poisson regression models rather than the geometrical model for estimating the peak-to-trough ratio of seasonal variation.
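A minimal Python sketch of the log-linear Poisson approach (this is not the Peak2Trough package; the use of monthly counts and a single first-order harmonic are assumptions): on the log-rate scale the seasonal amplitude is sqrt(beta_sin^2 + beta_cos^2), so the peak-to-trough ratio is exp(2 * amplitude).

```python
import numpy as np
import statsmodels.api as sm

def peak_to_trough(counts):
    """Estimate the peak-to-trough ratio of seasonal variation from monthly counts
    using a log-linear Poisson regression with one sine/cosine harmonic."""
    counts = np.asarray(counts, dtype=float)
    month = np.arange(len(counts))
    X = sm.add_constant(np.column_stack([np.sin(2 * np.pi * month / 12),
                                         np.cos(2 * np.pi * month / 12)]))
    fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
    amplitude = np.hypot(fit.params[1], fit.params[2])   # on the log-rate scale
    return np.exp(2 * amplitude)                          # peak rate / trough rate

# Hypothetical monthly counts with a winter peak
counts = [40, 38, 33, 28, 25, 22, 21, 24, 27, 31, 36, 41]
print(peak_to_trough(counts))
```

A secular trend could be accommodated by adding a linear term in time to the design matrix.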

20.
Quantile regression problems in practice may require flexible semiparametric forms of the predictor for modeling the dependence of responses on covariates. Furthermore, it is often necessary to add random effects that account for overdispersion caused by unobserved heterogeneity or for correlation in longitudinal data. We present a unified approach for Bayesian quantile inference on continuous responses via Markov chain Monte Carlo (MCMC) simulation and approximate inference using integrated nested Laplace approximations (INLA) in additive mixed models. Different types of covariates are all treated within the same general framework by assigning appropriate Gaussian Markov random field (GMRF) priors with different forms and degrees of smoothness. We apply the approach to extensive simulation studies and to a Munich rental dataset, showing that the methods are also computationally efficient in problems with many covariates and large datasets.
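The Bayesian MCMC/INLA additive mixed-model machinery is beyond a short sketch; as a simple point of comparison, a plain frequentist linear quantile regression can be fitted with statsmodels (simulated data; the choice tau = 0.9 is an assumption).

```python
import numpy as np
import statsmodels.api as sm

# A plain (frequentist) linear quantile regression as a baseline for the Bayesian
# additive mixed-model approach described above.
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 500)
y = 1.0 + 0.5 * x + rng.normal(scale=1 + 0.2 * x)   # heteroscedastic errors
X = sm.add_constant(x)
fit = sm.QuantReg(y, X).fit(q=0.9)
print(fit.params)                                    # intercept and slope of the 0.9 quantile
```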
