Similar literature
 20 similar documents found (search time: 656 ms)
1.
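This paper studies two ZIP models, comparing how the ordinary ZIP regression model and the ZIP(τ) model fit the same zero-inflated count data. The former is found to be more stable in fitting, and on this basis practical tips and caveats are derived for applying the two models to real data. The paper also studies the application of ZIP models to insurance ratemaking: applying both models to insurance data shows that the ZIP regression model clearly outperforms the ZIP(τ) model in both fitting and prediction.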

2.
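When the observations of a time series consist mostly of zeros together with some positive values, and the positive values follow a continuous distribution, common methods may fit poorly. To address this, Harvey and Ito (2020) proposed a censoring approach that shifts the original distribution to the left, i.e., subtracts a constant from the original random variable and sets the resulting negative values to 0; however, the generalized beta distribution of the second kind that they adopt has certain limitations. This article considers the more general weighted generalized beta distribution of the second kind and, using the conditional score approach, proposes a censored weighted generalized beta of the second kind dynamic model. The model is applied to Australian daily rainfall data and compared with the censored unweighted, zero-augmented weighted, and zero-augmented unweighted generalized beta of the second kind dynamic models; the censored weighted model is found to outperform the other three.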
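The shift-and-censor transformation described in this abstract can be illustrated with a minimal sketch. The function name `shift_and_censor` and the parameter `c` are hypothetical names chosen for illustration; the sketch covers only the data transformation, not the dynamic model or its estimation.

```python
import numpy as np

def shift_and_censor(x, c):
    """Subtract the constant c from every value and set negative results to 0.

    Hypothetical helper: it only mirrors the verbal description
    "shift left by a constant, then censor at zero".
    """
    shifted = np.asarray(x, dtype=float) - c
    return np.maximum(shifted, 0.0)

# Values at or below c collapse to exactly zero, producing a point mass at 0.
latent = [0.5, 2.0, 3.5]
observed = shift_and_censor(latent, c=1.0)
```

Under this reading, the proportion of zeros in the observed series corresponds to the probability that the latent variable falls at or below the shift constant.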

3.
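Longitudinal data are often analyzed with normal mixed-effects models. However, violations of the normality assumption often lead to invalid inference. Compared with traditional mean regression, quantile regression gives a complete characterization of the conditional distribution of the response and yields robust estimates even under non-normal error distributions. This paper considers quantile regression estimation and variable selection for longitudinal mixed-effects models with right-censored responses. First, the inverse censoring probability weighting method is used to obtain the parameter estimates. Second, inverse censoring probability weighting is combined with the LASSO penalty to address variable selection. Monte Carlo simulations show that the proposed method has advantages over the estimation approach that simply deletes the censored observations. Finally, an AIDS data set is analyzed to illustrate the practical performance of the proposed method.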

4.
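Longitudinal data are often analyzed with normal mixed-effects models. However, violations of the normality assumption often lead to invalid inference. Compared with traditional mean regression, quantile regression gives a complete characterization of the conditional distribution of the response and yields robust estimates even under non-normal error distributions. This paper considers quantile regression estimation and variable selection for longitudinal mixed-effects models with right-censored responses. First, the inverse censoring probability weighting method is used to obtain the parameter estimates. Second, inverse censoring probability weighting is combined with the LASSO penalty to address variable selection. Monte Carlo simulations show that the proposed method has advantages over the estimation approach that simply deletes the censored observations. Finally, an AIDS data set is analyzed to illustrate the practical performance of the proposed method.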

5.
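In collecting clinical data, competing risks or patient withdrawal can lead to censoring. Most statistical analyses of censored data rest on the assumption of independent censoring. In practice, however, censoring is often dependent, i.e., the censoring variable and the failure time variable are correlated. Dependent censoring makes the already complicated handling of censored data even harder. In this paper, the joint distribution of the censoring variable and the failure time variable is assumed to be expressible through a copula function of their marginal distributions; given the copula, the maximum likelihood estimator for the proportional hazards model is obtained. Simulations show that, if the censoring assumption holds, the proposed method is more accurate than estimation under the independent censoring assumption.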

6.
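Data drawn from different populations are highly heterogeneous, contain many zero values, and are highly dispersed. They can be modeled with a zero-inflated Poisson (ZIP) mixture regression model, but such mixture models involve many covariates. To screen out the important variables, this paper applies the adaptive LASSO to variable selection for the ZIP mixture regression model, i.e., a penalty term is added to the likelihood function and the parameters are then estimated with the EM algorithm. Simulations verify the effectiveness of the method in variable selection and parameter estimation. The ZIP mixture regression model is also applied to a real data analysis predicting the number of loan failures, identifying the factors with an important influence on loan failure. Finally, a comparison of predictive performance shows that the ZIP mixture regression model outperforms the Poisson, negative binomial (NB), and ZIP regression models.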

7.
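Overhead contact line (catenary) failures in urban rail transit have serious consequences but occur infrequently, which makes the data hard to analyze. Zero-inflated count models (ZIM) are well suited to data sets with a large share of zeros. Catenary failure data accumulated over nearly four years of Shanghai Metro operation are analyzed statistically, using the two most widely used ZIM models, ZIP and ZINB. Four evaluation indices are compared across the models, and the models are also assessed for hit rate, generalization ability, and interpretability. The study shows that the ZINB model fits the catenary failure data better. Based on the model results, recommendations are made for the safe operation strategy and the maintenance regime of urban rail transit catenary systems.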

8.
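This paper considers statistical inference for the mean residual life model with length-biased right-censored data. When the truncation variable satisfies the stationarity assumption, length-biased right-censored data carry more information than left-truncated right-censored data. To improve the efficiency of parameter estimation, we add this extra information to the construction of the estimating equations and obtain a new estimator through a combination method. Simulation results also show that the combined estimating equation approach is more efficient and performs better than the approach that considers only left-truncated right-censored data.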

9.
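Li Chengbo, Hu Shulan, Zhou Yong. 《数学学报》 (Acta Mathematica Sinica), 2018, 61(5): 865-880
This paper considers statistical inference for the mean residual life model with length-biased right-censored data. When the truncation variable satisfies the stationarity assumption, length-biased right-censored data carry more information than left-truncated right-censored data. To improve the efficiency of parameter estimation, we add this extra information to the construction of the estimating equations and obtain a new estimator through a combination method. Simulation results also show that the combined estimating equation approach is more efficient and performs better than the approach that considers only left-truncated right-censored data.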

10.
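Chen Min, K.C.Yune, Zhu Lixing. 《中国科学A辑》 (Science in China, Series A), 2002, 32(11): 961-974
This paper studies hypothesis testing for partially linear regression models under random censoring. A test statistic is proposed to check whether the data satisfy a partially linear regression model; it is a quadratic form of the residual-based cusum process. The asymptotic distributions of the test statistic under the null hypothesis and under local alternatives are studied. Numerical simulations show that the test has good power.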

11.
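Zero-inflated generalized Poisson regression models and insurance ratemaking   Cited by: 1 (self-citations: 0, citations by others: 1)
In classification ratemaking for insurance products, one of the most commonly used models is the Poisson regression model. When the loss data are zero-inflated, a zero-inflated Poisson regression model is usually adopted. That model generally assumes that the proportion of structural zeros, φ, is constant and unaffected by the rating factors, which may depart from reality. Assuming instead that φ is related to the rating factors, the paper builds a zero-inflated generalized Poisson regression model, the ZIGP(τ) regression model. Fitting a set of automobile insurance loss data shows that the ZIGP(τ) regression model can effectively improve the fit to real data and thereby make the ratemaking results more reasonable.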
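As background for the constant-φ assumption mentioned in this abstract, the following minimal sketch writes out the standard zero-inflated Poisson probability mass function with a single structural-zero proportion. The function name `zip_pmf` and the parameter names `lam` and `phi` are chosen for illustration; the sketch does not implement the ZIGP(τ) model, whose exact link between φ and the rating factors is not given in the abstract.

```python
import math

def zip_pmf(k, lam, phi):
    """Standard ZIP pmf: a point mass at 0 with weight phi mixed with Poisson(lam)."""
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise from structural zeros or from the Poisson component.
        return phi + (1.0 - phi) * poisson
    return (1.0 - phi) * poisson

# Probability of observing no claims when lam = 2.0 and phi = 0.3.
p_zero = zip_pmf(0, lam=2.0, phi=0.3)
```

A regression version would let `lam`, and in the relaxed setting also `phi`, vary with each policy's covariates rather than fixing them globally.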

12.
A first-order INteger-valued AutoRegressive (INAR) process with zero-inflated Poisson distributed innovations was proposed by Jazi, Jones and Lai (2012) [First-order integer valued AR processes with zero inflated Poisson innovations. Journal of Time Series Analysis. 33, 954–963.], which can handle zero-inflated/deflated count time series data. The inferential aspects of this model were not well explored by the authors; only a conditional maximum likelihood approach was briefly discussed. In this paper, we explore the inferential aspects of this zero-inflated Poisson INAR(1) process. We propose parameter estimation through Two-Step Conditional Least Squares and Yule–Walker methods. The asymptotic properties of the estimators are provided. Simulation results about the finite-sample behavior of both estimation methods and comparisons with the conditional maximum likelihood approach are presented under correct model specification and misspecification. Two empirical applications to real data sets are considered in order to illustrate the usefulness of the proposed methodology in practical situations.
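To make the process in this abstract concrete, here is a minimal simulation sketch. It assumes the usual INAR(1) recursion with binomial thinning, X_t = α∘X_{t-1} + ε_t, with zero-inflated Poisson innovations; the abstract does not spell out the thinning operator, so that choice, the zero starting value, and all names (`simulate_zip_inar1`, `alpha`, `lam`, `phi`) are assumptions made for illustration. None of the estimation methods from the paper are implemented.

```python
import numpy as np

def simulate_zip_inar1(n, alpha, lam, phi, seed=0):
    """Simulate n steps of an INAR(1) process with ZIP innovations.

    Assumed recursion: x[t] = Binomial(x[t-1], alpha) + eps[t], where eps[t]
    is 0 with probability phi and Poisson(lam) otherwise. Starts at x[0] = 0.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n, dtype=np.int64)
    for t in range(1, n):
        # Binomial thinning: each of the x[t-1] units survives with prob. alpha.
        survivors = rng.binomial(x[t - 1], alpha)
        # Zero-inflated Poisson innovation.
        innovation = 0 if rng.random() < phi else rng.poisson(lam)
        x[t] = survivors + innovation
    return x

series = simulate_zip_inar1(n=200, alpha=0.4, lam=2.0, phi=0.3)
```

A simulated path like this is the kind of input on which conditional least squares or Yule-Walker estimators could be compared, although the estimators themselves are outside this sketch.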

13.
Non-negative matrix factorization (NMF) is a technique of multivariate analysis used to approximate a given matrix containing non-negative data using two non-negative factor matrices that has been applied to a number of fields. However, when a matrix containing non-negative data has many zeroes, NMF encounters an approximation difficulty. This zero-inflated situation occurs often when a data matrix is given as count data, and becomes more challenging with matrices of increasing size. To solve this problem, we propose a new NMF model for zero-inflated non-negative matrices. Our model is based on the zero-inflated Tweedie distribution. The Tweedie distribution is a generalization of the normal, the Poisson, and the gamma distributions, and differs from each of the other distributions in the degree of robustness of its estimated parameters. In this paper, we show through numerical examples that the proposed model is superior to the basic NMF model in terms of approximation of zero-inflated data. Furthermore, we show the differences between the estimated basis vectors found using the basic and the proposed NMF models for \(\beta\)-divergence by applying it to real purchasing data.

14.
Count data with excess zeros encountered in many applications often exhibit extra variation. Therefore, the zero-inflated Poisson (ZIP) model may fail to fit such data. In this paper, a zero-inflated double Poisson (ZIDP) model, which is a generalization of the ZIP model, is studied, and score tests for the significance of dispersion and zero-inflation in the ZIDP model are developed. This work also develops homogeneity tests for the dispersion and/or zero-inflation parameter, and the corresponding score test statistics are obtained. One numerical example is given to illustrate our methodology, and the properties of the score test statistics are investigated through Monte Carlo simulations.

15.
Count data with excess zeros are often encountered in many medical, biomedical and public health applications. In this paper, an extension of zero-inflated Poisson mixed regression models is presented for dealing with multilevel data sets, referred to as hierarchical mixture zero-inflated Poisson mixed regression models. A stochastic EM algorithm is developed for obtaining the ML estimates of the parameters of interest, and model comparison through the BIC criterion is also considered for comparing models with different latent classes. An application to the analysis of count data from a Shanghai Adolescence Fitness Survey and a simulation study illustrate the usefulness and effectiveness of our methodologies.

16.
The theory of tree-growing (RECPAM approach) is developed for outcome variables distributed according to a canonical exponential family. The general RECPAM approach (consisting of three steps: recursive partition, pruning and amalgamation) is reviewed. This is seen as constructing a partition with maximal information content about a parameter to be predicted, followed by simplification by the elimination of ‘negligible’ information. The measure of information is defined for an exponential family outcome as a deviance difference, and appropriate modifications of pruning and amalgamation rules are discussed. It is further shown how the proposed approach makes it possible to develop tree-growing for situations usually treated by generalized linear models (GLIM). In particular, Poisson and logistic regression can be tree-structured. Moreover, censored survival data can be treated, as in GLIM, by observing a formal equivalence of the likelihood under random censoring and an appropriate Poisson model. Three examples are given of application to Poisson, binary and censored survival data.

17.
In applications involving count data, it is common to encounter an excess number of zeros. In the study of outpatient service utilization, for example, the number of utilization days will take on integer values, with many subjects having no utilization (zero values). Mixed-distribution models, such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB), are often used to fit such data. A more general class of mixture models, called hurdle models, can be used to model zero-deflation as well as zero-inflation. Several authors have proposed frequentist approaches to fitting zero-inflated models for repeated measures. We describe a practical Bayesian approach which incorporates prior information, has optimal small-sample properties, and allows for tractable inference. The approach can be easily implemented using standard Bayesian software. A study of psychiatric outpatient service use illustrates the methods.

18.
For nonnegative measurements such as income or sick days, zero counts often have special status. Furthermore, the incidence of zero counts is often greater than expected for the Poisson model. This article considers a doubly semiparametric zero-inflated Poisson model to fit data of this type, which assumes two partially linear link functions in both the mean of the Poisson component and the probability of zero. We study a sieve maximum likelihood estimator for both the regression parameters and the nonparametric functions. We show, under routine conditions, that the estimators are strongly consistent. Moreover, the parameter estimators are asymptotically normal and first order efficient, while the nonparametric components achieve the optimal convergence rates. Simulation studies suggest that the extra flexibility inherent in the doubly semiparametric model is gained with little loss in statistical efficiency. We also illustrate our approach with a dataset from a public health study.

19.
Customized personal rate offering is of growing importance in the insurance industry. To achieve this, an important step is to identify subgroups of insureds from the corresponding heterogeneous claim frequency data. In this paper, a penalized Poisson regression approach for subgroup analysis in claim frequency data is proposed. Subjects are assumed to follow a zero-inflated Poisson regression model with group-specific intercepts, which capture group characteristics of claim frequency. A penalized likelihood function is derived and optimized to identify the group-specific intercepts and effects of individual covariates. To handle the challenges arising from the optimization of the penalized likelihood function, an alternating direction method of multipliers algorithm is developed and its convergence is established. Simulation studies and real applications are provided for illustration.

20.
In this article, we consider a semiparametric zero-inflated Poisson mixed model that postulates a possible nonlinear relationship between the natural logarithm of the mean of the counts and a particular covariate in longitudinal studies. A penalized log-likelihood function is proposed and a Monte Carlo expectation-maximization algorithm is used to derive the estimates. Under some mild conditions, we establish the consistency and asymptotic normality of the resulting estimators. Simulation studies are carried out to investigate the finite sample performance of the proposed method. For illustration purposes, the method is applied to a data set from a pharmaceutical company where the variable of interest is the number of episodes of side effects after the patient has taken the treatments.
