Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
    
The Hartley‐Rao‐Cochran sampling design is an unequal probability sampling design which can be used to select samples from finite populations. We propose to adjust the empirical likelihood approach for the Hartley‐Rao‐Cochran sampling design. The proposed approach intrinsically incorporates sampling weights and auxiliary information, and allows for large sampling fractions. It can be used to construct confidence intervals. In a simulation study, we show that the coverage may be better for the empirical likelihood confidence interval than for standard confidence intervals based on variance estimates. The proposed approach is simple to implement and less computer intensive than the bootstrap. The proposed confidence interval does not rely on re‐sampling, linearization, variance estimation, design‐effects or joint inclusion probabilities.
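As background for the design named in this abstract, the following is a minimal sketch of the usual Hartley‐Rao‐Cochran selection scheme and its standard point estimator of a total. It does not implement the empirical likelihood method of the paper. The function names `rhc_sample` and `rhc_total`, and the way the population is split into near-equal random groups, are assumptions made for illustration.

```python
import random

def rhc_sample(sizes, n, rng):
    """Split the population into n random groups and draw one unit per group,
    with probability proportional to size within the group.
    Assumes len(sizes) >= n and strictly positive sizes."""
    total = float(sum(sizes))
    p = [s / total for s in sizes]            # normalized size measures
    units = list(range(len(sizes)))
    rng.shuffle(units)
    groups = [units[g::n] for g in range(n)]  # n random groups of near-equal size
    draws = []
    for group in groups:
        group_p = sum(p[i] for i in group)    # total normalized size of the group
        chosen = rng.choices(group, weights=[p[i] for i in group], k=1)[0]
        draws.append((chosen, p[chosen], group_p))
    return draws

def rhc_total(y, draws):
    """Point estimator of the population total: sum of y_i * P_g / p_i."""
    return sum(y[i] * group_p / p_i for i, p_i, group_p in draws)
```

When the study variable is exactly proportional to the size measure, this estimator reproduces the population total for every possible sample, which gives a convenient sanity check.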

2.
    
Item non‐response in surveys occurs when some, but not all, variables are missing. Unadjusted estimators tend to exhibit some bias, called the non‐response bias, if the respondents differ from the non‐respondents with respect to the study variables. In this paper, we focus on item non‐response, which is usually treated by some form of single imputation. We examine the properties of doubly robust imputation procedures, which are those that lead to an estimator that remains consistent if either the outcome variable or the non‐response mechanism is adequately modelled. We establish the double robustness property of the imputed estimator of the finite population distribution function under random hot‐deck imputation within classes. We also discuss the links between our approach and that of Chambers and Dunstan. The results of a simulation study support our findings.

3.
    
The authors develop jackknife and analytical variance estimators for the estimator of Chambers & Dunstan (1986) and Rao, Kovar & Mantel (1990) of the finite population distribution function, using complete auxiliary information. They also describe the associated model and show the design consistency of the variance estimators, whose small‐sample performance is examined through a limited simulation study. They highlight the operational advantages of the jackknife in the model‐based setting of Chambers & Dunstan (1986) and its better conditional performance in the design‐based setting of Rao, Kovar & Mantel (1990).

4.
    
Imputation is often used in surveys to treat item nonresponse. It is well known that treating the imputed values as observed values may lead to substantial underestimation of the variance of the point estimators. To overcome the problem, a number of variance estimation methods have been proposed in the literature, including resampling methods such as the jackknife and the bootstrap. In this paper, we consider the problem of doubly robust inference in the presence of imputed survey data. In the doubly robust literature, point estimation has been the main focus. In this paper, using the reverse framework for variance estimation, we derive doubly robust linearization variance estimators in the case of deterministic and random regression imputation within imputation classes. Also, we study the properties of several jackknife variance estimators under both negligible and nonnegligible sampling fractions. A limited simulation study investigates the performance of various variance estimators in terms of relative bias and relative stability. Finally, the asymptotic normality of imputed estimators is established for stratified multistage designs under both deterministic and random regression imputation. The Canadian Journal of Statistics 40: 259–281; 2012 © 2012 Statistical Society of Canada

5.
    
Despite having desirable properties, model‐assisted estimators are rarely used in anything but their simplest form to produce official statistics. This is due to the fact that the more complicated models are often ill suited to the available auxiliary data. Under a model‐assisted framework, we propose a regression tree estimator for a finite‐population total. Regression tree models are adept at handling the type of auxiliary data usually available in the sampling frame and provide a model that is easy to explain and justify. The estimator can be viewed as a post‐stratification estimator where the post‐strata are automatically selected by the recursive partitioning algorithm of the regression tree. We establish consistency of the regression tree estimator and a variance estimator, along with asymptotic normality of the regression tree estimator. We compare the performance of our estimator to other survey estimators using the United States Bureau of Labor Statistics Occupational Employment Statistics Survey data.
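To illustrate the post‐stratification view mentioned in this abstract, here is a minimal sketch of a post‐stratified estimator of a total. It assumes the post‐strata are already given and that units are sampled with equal probability within each post‐stratum. In the paper the post‐strata come from a regression tree, which this sketch does not fit. The name `poststratified_total` and its arguments are hypothetical.

```python
from collections import defaultdict

def poststratified_total(sample_y, sample_stratum, stratum_counts):
    """Estimate a population total as the sum over post-strata of
    (known population count N_h) * (sample mean of y in post-stratum h).
    Raises ZeroDivisionError if a post-stratum has no sampled units."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for y, h in zip(sample_y, sample_stratum):
        sums[h] += y
        counts[h] += 1
    return sum(n_pop * sums[h] / counts[h] for h, n_pop in stratum_counts.items())
```

For example, with sampled values 1 and 3 in a post‐stratum of 10 population units and a sampled value 10 in a post‐stratum of 5 units, the estimate is 10 × 2 + 5 × 10 = 70.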

6.
    
The author considers the use of auxiliary information available at population level to improve the estimation of finite population totals. She introduces a new type of model‐assisted estimator based on nonparametric regression splines. The estimator is a weighted linear combination of the study variable with weights calibrated to the known population totals of the B‐splines. The author shows that the estimator is asymptotically design‐unbiased and consistent under conditions which do not require the superpopulation model to be correct. She proposes a design‐based variance approximation and shows that the anticipated variance is asymptotically equivalent to the Godambe‐Joshi lower bound. She also shows through simulations that the estimator has good properties.

7.
Abstract. A model‐based predictive estimator is proposed for the population proportions of a polychotomous response variable, based on a sample from the population and on auxiliary variables, whose values are known for the entire population. The responses for the non‐sample units are predicted using a multinomial logit model, which is a parametric function of the auxiliary variables. A bootstrap estimator is proposed for the variance of the predictive estimator; its consistency is proved and its small sample performance is compared with that of an analytical estimator. The proposed predictive estimator is compared with other available estimators, including model‐assisted ones, both in a simulation study involving different sampling designs and model mis‐specification, and using real data from an opinion survey. The results indicate that the prediction approach appears to use auxiliary information more efficiently than the model‐assisted approach.

8.
    
In this paper, we propose a new generalized regression estimator for the problem of estimating the population total using unequal probability sampling without replacement. A modified automated linearization approach is applied in order to transform the proposed estimator to estimate the variance of the population total. The variance and the estimated variance of the proposed estimator are investigated under a reverse framework, assuming that the sampling fraction is negligible and that there are equal response probabilities for all units. We prove that the proposed estimator is asymptotically unbiased and that it does not require a known or estimated response probability.

9.
    
The technique introduced in this paper is a means for estimating and discovering underlying patterns for a large number of curves observed with heteroscedastic errors. Therefore, both the mean and the variance functions of each curve are assumed unknown and varying over time. The method consists of a series of steps. We transform using an orthonormal basis of functions in L2. In the transform domain, the non-parametric regression is reduced to a means model. To estimate the means in the transform domain, we consider the class of linear or modulation estimators and proceed as in Beran and Dümbgen (R. Beran and L. Dümbgen, Modulation of estimators and confidence sets, Ann. Stat. 26(5) (1998), pp. 1826–1856) by minimising Stein's unbiased risk estimate. By minimising the risk over a nested subset selection of modulators, we reduce the dimensionality of the means space. We show that in the transform space, the risk estimate is asymptotically optimal in Pinsker's minimax sense over Sobolev ellipsoids under heteroscedastic errors. Coefficient estimation and dimensionality reduction via optimal risk estimation are essential for accurate clustering membership estimation. We illustrate our technique by estimating and clustering a large number of curves both within a synthetic example and within a specific application. In this application, we analyse the research and development expenditure of a subset of companies in the Compustat Global database. We show that our method compares favourably to two alternative approaches.

10.
    
Much of the small‐area estimation literature focuses on population totals and means. However, users of survey data are often interested in the finite‐population distribution of a survey variable and in the measures (e.g. medians, quartiles, percentiles) that characterize the shape of this distribution at the small‐area level. In this paper we propose a model‐based direct estimator (MBDE, Chandra and Chambers) of the small‐area distribution function. The MBDE is defined as a weighted sum of sample data from the area of interest, with weights derived from the calibrated spline‐based estimate of the finite‐population distribution function introduced by Harms and Duchesne, under an appropriately specified regression model with random area effects. We also discuss the mean squared error estimation of the MBDE. Monte Carlo simulations based on both simulated and real data sets show that the proposed MBDE and its associated mean squared error estimator perform well when compared with alternative estimators of the area‐specific finite‐population distribution function.

11.
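This article studies robust estimation for semiparametric models based on quantile regression when the response variable is missing at random. First, using B‐spline basis function approximation, the problem of estimating the nonparametric function of the model is converted into the problem of estimating a vector of spline coefficients. Second, with the response variable missing at random, a new imputation method is proposed to multiply impute the missing responses. Third, based on the imputed data sets, a new quantile objective function is constructed, yielding robust estimates of the nonparametric function and the parameter vector of the model. Finally, an efficient algorithm is given for computing the multiple imputation estimator. Simulation studies verify the effectiveness and robustness of the proposed method.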

12.
    
Clustering is one of the most used tools in data analysis. In the last decades, due to the increasing complexity of data, soft clustering has received a great deal of attention. There exist different approaches that can be considered as soft. The best known is the fuzzy approach, which consists in assigning objects to clusters with membership degrees, ranging in the unit interval, that depend on the dissimilarities between each object and all the prototypes. Closely related to the fuzzy approach is the possibilistic one, which relaxes some constraints on the membership degrees. In particular, the objects are assigned to clusters with degrees of typicality, depending just on the dissimilarities between each object and the closest prototype. A further soft approach is the rough one. In this case, there are no degrees ranging between 0 and 1, but objects with intermediate features belong to the boundary region and are assigned to more than one cluster. Even if it is not universally recognized in the scientific community as an approach of soft clustering, from our point of view, the model‐based approach can also be considered as such. Model‐based clustering methods also produce a soft partition of the objects and the posterior probability of a component membership may play a role similar to the membership degree. The four approaches are critically described from a theoretical point of view and an empirical comparative analysis is carried out. This article is categorized under: Statistical Learning and Exploratory Methods of the Data Sciences > Clustering and Classification; Statistical and Graphical Methods of Data Analysis > Multivariate Analysis; Statistical Learning and Exploratory Methods of the Data Sciences > Exploratory Data Analysis

13.
Communications in Statistics - Theory and Methods, 2012, 41(13-14): 2305-2320
We consider shrinkage and preliminary test estimation strategies for the matrix of regression parameters in the multivariate multiple regression model in the presence of a natural linear constraint. The goal of this article is to critically examine the relative performances of these estimators in the direction of the subspace and candidate subspace restricted type estimators. Our analytical and numerical results show that the proposed shrinkage and preliminary test estimators perform better than the benchmark estimator under candidate subspace and beyond. The methods are also applied to a real data set for illustrative purposes.

14.
    
Two-phase regression models with inequality constraints on the regression coefficients and with a small number of measurements are considered. A new test for the presence of a change-point, based on the likelihood ratio in a linear model with inequality constraints, is proposed. Numerical approximations to the powers against various alternatives are given and compared with the powers of the likelihood ratio test in the two-phase regression models without inequality constraints, the backwards CUSUM test, and the k-linear-r-ahead recursive residuals tests. Performance of related likelihood based estimators of the change-point is briefly studied in a Monte Carlo experiment.

15.
16.
    
When a sufficient correlation between the study variable and the auxiliary variable exists, the ranks of the auxiliary variable are also correlated with the study variable, and thus, these ranks can be used as an effective tool in increasing the precision of an estimator. In this paper, we propose a new improved estimator of the finite population mean that incorporates the supplementary information in the form of: (i) the auxiliary variable and (ii) ranks of the auxiliary variable. Mathematical expressions for the bias and the mean-squared error of the proposed estimator are derived under the first order of approximation. The theoretical and empirical studies reveal that the proposed estimator always performs better than the usual mean, ratio, product, exponential-ratio and exponential-product, and classical regression estimators, and the Rao (1991), Singh et al. (2009), Shabbir and Gupta (2010), and Grover and Kaur (2011, 2014) estimators.
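For context, the classical ratio estimator that this abstract uses as a benchmark can be sketched as follows. This is only the textbook benchmark under simple random sampling, not the rank-based estimator proposed in the paper. The function name `ratio_mean_estimate` and its arguments are hypothetical.

```python
def ratio_mean_estimate(sample_y, sample_x, population_mean_x):
    """Classical ratio estimator of the population mean of y:
    (sample mean of y) * (known population mean of x) / (sample mean of x).
    Assumes the sample mean of x is nonzero."""
    mean_y = sum(sample_y) / len(sample_y)
    mean_x = sum(sample_x) / len(sample_x)
    return mean_y * population_mean_x / mean_x
```

With sample values y = (2, 4, 6), x = (1, 2, 3) and a known population mean of x equal to 5, the estimate is 4 × 5 / 2 = 10.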

17.
    
In some applications, the failure time of interest is the time from an originating event to a failure event while both event times are interval censored. We propose fitting Cox proportional hazards models to this type of data using a spline‐based sieve maximum marginal likelihood, where the time to the originating event is integrated out in the empirical likelihood function of the failure time of interest. This greatly reduces the complexity of the objective function compared with the fully semiparametric likelihood. The dependence of the time of interest on time to the originating event is induced by including the latter as a covariate in the proportional hazards model for the failure time of interest. The use of splines results in a higher rate of convergence of the estimator of the baseline hazard function compared with the usual non‐parametric estimator. The computation of the estimator is facilitated by a multiple imputation approach. Asymptotic theory is established and a simulation study is conducted to assess its finite sample performance. It is also applied to analyzing a real data set on AIDS incubation time.

18.
    
There has been increasing use of quality-of-life (QoL) instruments in drug development. Missing item values often occur in QoL data. A common approach to solve this problem is to impute the missing values before scoring. Several imputation procedures, such as imputing with the most correlated item and imputing with a row/column model or an item response model, have been proposed. We examine these procedures using data from two clinical trials, in which the original asthma quality-of-life questionnaire (AQLQ) and the miniAQLQ were used. We propose two modifications to existing procedures: truncating the imputed values to eliminate outliers and using the proportional odds model as the item response model for imputation. We also propose a novel imputation method based on a semi-parametric beta regression so that the imputed value is always in the correct range and illustrate how this approach can easily be implemented in commonly used statistical software. To compare these approaches, we deleted 5% of item values in the data according to three different missingness mechanisms, imputed them using these approaches and compared the imputed values with the true values. Our comparison showed that the row/column-model-based imputation with truncation generally performed better, whereas our new approach had better performance under a number of scenarios.

19.
    
This paper discusses an alternative approach to the estimation procedure presented in a recently published paper. The authors developed a model‐based clustering approach for regression time series and proposed the APECM procedure as an acceleration method for the expectation–maximization algorithm. The process of the estimation of model parameters was discussed in great detail. In this paper, we show how the proposed procedure can be modified to achieve substantial acceleration and better stability. In particular, numerical maximization suggested for the estimation of parameters can be replaced with analytical closed‐form expressions, and inverting high dimensional matrices can be avoided entirely. A convenient approach for assessing variability in parameter estimates is also provided. The results of conducted experiments are very promising. © 2012 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2012

20.
Wild Bootstrapping in Finite Populations with Auxiliary Information   Total citations: 1, self-citations: 0, citations by others: 1
Consider a finite population u, which can be viewed as a realization of a super-population model. A simple ratio model (linear regression, without intercept) with heteroscedastic errors is supposed to have generated u. A random sample is drawn without replacement from u. In this set-up a two-stage wild bootstrap resampling scheme as well as several other useful forms of bootstrapping in finite populations will be considered. Some asymptotic results for various bootstrap approximations for normalized and Studentized versions of the well-known ratio and regression estimators are given. Bootstrap based confidence intervals for the population total and for the regression parameter of the underlying ratio model are also discussed.
