Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
We propose a general family of mixture hazard models to analyze lifetime data associated with bathtub and multimodal hazard functions. This model gives great flexibility for fitting lifetime data. Its version with covariates has the proportional hazard and the accelerated failure time models as special cases. A Bayesian analysis of the model is presented using informative priors, with sampling-based approaches used to perform the Bayesian computations. A real example with medical data illustrates the methodology.

2.
Distance sampling is a technique for estimating the abundance of animals or other objects in a region, allowing for imperfect detection. This paper evaluates the statistical efficiency of the method when its assumptions are met, both theoretically and by simulation. The theoretical component of the paper is a derivation of the asymptotic variance penalty for the distance sampling estimator arising from uncertainty about the unknown detection parameters. This asymptotic penalty factor is tabulated for several detection functions. It is typically at least 2 but can be much higher, particularly for steeply declining detection rates. The asymptotic result relies on a model which makes the strong assumption that objects are uniformly distributed across the region. The simulation study relaxes this assumption by incorporating overdispersion when generating object locations. Distance sampling and strip transect estimators are calculated for simulated data, for a variety of overdispersion factors, detection functions, sample sizes and strip widths. The simulation results confirm the theoretical asymptotic penalty in the non-overdispersed case. For a more realistic overdispersion factor of 2, distance sampling estimation outperforms strip transect estimation when a half-normal detection function is correctly assumed, confirming previous literature. When the hazard rate model is correctly assumed, strip transect estimators have lower mean squared error than the usual distance sampling estimator when the strip width is close enough to its optimal value (±75% when there are 100 detections; ±50% when there are 200 detections). Whether the ecologist can set the strip width sufficiently accurately will depend on the circumstances of each particular study.
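As a rough illustration of the two estimators compared in this abstract, the sketch below contrasts a conventional line-transect density estimate under a half-normal detection function with a strip transect estimate that assumes perfect detection inside the strip. The function names, the truncation distance `w`, and the treatment of `sigma` as known are assumptions made for the example; the abstract does not give the paper's notation, and in practice `sigma` is estimated from the observed distances.

```python
import math


def half_normal_detection(x, sigma):
    """Half-normal detection probability at perpendicular distance x."""
    return math.exp(-x * x / (2.0 * sigma * sigma))


def effective_strip_half_width(sigma, w):
    """Integral of the half-normal detection function from 0 to w.

    Closed form: sigma * sqrt(pi / 2) * erf(w / (sigma * sqrt(2))).
    """
    return sigma * math.sqrt(math.pi / 2.0) * math.erf(w / (sigma * math.sqrt(2.0)))


def distance_sampling_density(n_detections, line_length, sigma, w):
    """Line-transect density estimate n / (2 * L * mu), with sigma assumed known."""
    mu = effective_strip_half_width(sigma, w)
    return n_detections / (2.0 * line_length * mu)


def strip_transect_density(n_detections, line_length, w):
    """Strip transect estimate n / (2 * L * w): every object within w is assumed seen."""
    return n_detections / (2.0 * line_length * w)
```

Because the effective strip half-width can never exceed `w`, the distance sampling estimate is never smaller than the strip estimate for the same count. The variance penalty discussed in the abstract comes from having to estimate `sigma`, which this sketch deliberately leaves out.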

3.
Anderson and Pospahala (1970) investigated the estimation of wildlife population size using the belt or line transect sampling method and devised a correction for bias, thus leading to an estimator with interesting characteristics. This work was given a uniform mathematical framework in Burnham and Anderson (1976). In this paper we extend that mathematical framework to several different sampling models, and a number of interesting discrete probability distributions emerge.

4.
5.
Distance sampling is a widely used method to estimate animal population size. Most distance sampling models utilize a monotonically decreasing detection function such as a half-normal. Recent advances in distance sampling modeling allow for the incorporation of covariates into the distance model, and the elimination of the assumption of perfect detection at some fixed distance (usually the transect line) with the use of double-observer models. The assumption of full observer independence in the double-observer model is problematic, but can be addressed by using the point independence assumption, which assumes there is one distance, the apex of the detection function, where the two observers are independent. Aerially collected distance sampling data can have a unimodal shape and have been successfully modeled with a gamma detection function. Covariates in gamma detection models cause the apex of detection to shift depending upon covariate levels, making this model incompatible with the point independence assumption when using double-observer data. This paper reports a unimodal detection model based on a two-piece normal distribution that allows covariates, has only one apex, and is consistent with the point independence assumption when double-observer data are utilized. An aerial line-transect survey of black bears in Alaska illustrates how this method can be applied.
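The shape of a two-piece normal detection function can be sketched as follows. This is only one plausible parameterization, since the abstract does not give the paper's exact form: the names `apex`, `sigma_left` and `sigma_right` are assumptions, and covariates are not modeled here.

```python
import math


def two_piece_normal_detection(x, apex, sigma_left, sigma_right):
    """Unimodal detection curve built from two Gaussian halves.

    Detection equals 1 at the apex and falls off with scale sigma_left
    for distances below the apex and sigma_right for distances above it.
    """
    sigma = sigma_left if x < apex else sigma_right
    z = (x - apex) / sigma
    return math.exp(-0.5 * z * z)
```

In this form the apex location is a separate parameter from the two scales. One reading of the abstract is that covariates would then act on the scales while the apex stays fixed, which is what the point independence assumption needs.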

6.
7.
Variable Selection for Clustering with Gaussian Mixture Models   Cited by: 3 (self-citations: 0, citations by others: 3)
Summary.  This article is concerned with variable selection for cluster analysis. The problem is regarded as a model selection problem in the model-based cluster analysis context. A model generalizing the model of Raftery and Dean (2006,  Journal of the American Statistical Association   101, 168–178) is proposed to specify the role of each variable. This model does not need any prior assumptions about the linear link between the selected and discarded variables. Models are compared using the Bayesian information criterion. Variable role is obtained through an algorithm embedding two backward stepwise algorithms for variable selection for clustering and linear regression. The model identifiability is established and the consistency of the resulting criterion is proved under regularity conditions. Numerical experiments on simulated datasets and a genomic application highlight the interest of the procedure.

8.
Sampling Variances of Heterozygosity and Genetic Distance   Cited by: 76 (self-citations: 11, citations by others: 65)
Mathematical formulae for the sampling variances of average heterozygosity and Nei's genetic distance are developed. These sampling variances are decomposed into their two components, i.e. the inter-locus and intra-locus variances. The relationship between the number of loci and the number of individuals per locus to be examined for estimating average heterozygosity and genetic distance is also discussed. The utility of the inter-locus variance of heterozygosity for studying the mechanism of maintenance of genetic variability in populations is indicated.
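For orientation, the quantities named in this abstract can be computed from allele frequencies as sketched below. The code uses the textbook definitions of per-locus heterozygosity and Nei's standard genetic distance and treats the allele frequencies as known, so it omits the finite-sample (intra-locus) component that the paper analyzes. The function names and the input layout, one list of allele frequencies per locus, are assumptions for the example.

```python
import math


def locus_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def average_heterozygosity(loci):
    """Mean of the per-locus heterozygosities."""
    values = [locus_heterozygosity(freqs) for freqs in loci]
    return sum(values) / len(values)


def interlocus_variance_of_mean(loci):
    """Variance of the average heterozygosity due to variation among loci.

    Sample variance of the per-locus values divided by the number of loci r,
    i.e. sum((h_k - H)^2) / (r * (r - 1)). Requires at least two loci.
    """
    values = [locus_heterozygosity(freqs) for freqs in loci]
    r = len(values)
    mean = sum(values) / r
    return sum((h - mean) ** 2 for h in values) / (r * (r - 1))


def nei_standard_distance(loci_x, loci_y):
    """Nei's standard genetic distance D = -ln(J_xy / sqrt(J_x * J_y)).

    Both populations must list the same loci and alleles in the same order.
    """
    r = len(loci_x)
    j_x = sum(sum(p * p for p in fx) for fx in loci_x) / r
    j_y = sum(sum(q * q for q in fy) for fy in loci_y) / r
    j_xy = sum(sum(p * q for p, q in zip(fx, fy)) for fx, fy in zip(loci_x, loci_y)) / r
    return -math.log(j_xy / math.sqrt(j_x * j_y))
```

Two populations with identical allele frequencies at every locus give a distance of zero, and the inter-locus variance shrinks as more loci are examined, which is the trade-off between loci and individuals that the abstract discusses.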

9.
A condition for practical independence of contact distribution functions in Boolean models is obtained. This result allows the authors to use maximum likelihood methods, via sparse sampling, for estimating unknown parameters of an isotropic Boolean model. The second part of this paper is devoted to a simulation study of the proposed method. AMS classification: 60D05

10.
A vast literature has recently been concerned with the analysis of variation in disease counts recorded across geographical areas with the aim of detecting clusters of regions with homogeneous behavior. Most of the proposed modeling approaches have been discussed for the univariate case, and only very recently have spatial models been extended to predict more than one outcome simultaneously. In this paper we extend the standard finite mixture models to the analysis of multiple, spatially correlated, counts. Dependence among outcomes is modeled using a set of correlated random effects and estimation is carried out by numerical integration through an EM algorithm without assuming any specific parametric distribution for the random effects. The spatial structure is captured by the use of a Gibbs representation for the prior probabilities of component membership through a Strauss-like model. The proposed model is illustrated using real data (© 2009 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

11.
Summary.  We consider a fully model-based approach for the analysis of distance sampling data. Distance sampling has been widely used to estimate abundance (or density) of animals or plants in a spatially explicit study area. There is, however, no readily available method of making statistical inference on the relationships between abundance and environmental covariates. Spatial Poisson process likelihoods can be used to simultaneously estimate detection and intensity parameters by modeling distance sampling data as a thinned spatial point process. A model-based spatial approach to distance sampling data has three main benefits: it allows complex and opportunistic transect designs to be employed, it allows estimation of abundance in small subregions, and it provides a framework to assess the effects of habitat or experimental manipulation on density. We demonstrate the model-based methodology with a small simulation study and analysis of the Dubbo weed data set. In addition, a simple ad hoc method for handling overdispersion is proposed. The simulation study showed that the model-based approach compared favorably to conventional distance sampling methods for abundance estimation. The overdispersion correction also performed adequately when the number of transects was high. Analysis of the Dubbo data set indicated a transect effect on abundance via Akaike's information criterion model selection. Further goodness-of-fit analysis, however, indicated some potential confounding of intensity with the detection function.

12.
13.
Phylogenetic mixture models are statistical models of character evolution allowing for heterogeneity. Each of the classes in some unknown partition of the characters may evolve by different processes, or even along different trees. Such models are of increasing interest for data analysis, as they can capture the variety of evolutionary processes that may be occurring across long sequences of DNA or proteins. The fundamental question of whether parameters of such a model are identifiable is difficult to address, due to the complexity of the parameterization. Identifiability is, however, essential to their use for statistical inference.

14.
Biochemical demands constrain the range of amino acids acceptable at specific sites, resulting in across-site compositional heterogeneity of the amino acid replacement process. Phylogenetic models that disregard this heterogeneity are prone to systematic errors, which can lead to severe long-branch attraction artifacts. State-of-the-art models accounting for across-site compositional heterogeneity include the CAT model, which is computationally expensive, and empirical distribution mixture models estimated via maximum likelihood (C10–C60 models). Here, we present a new, scalable method, EDCluster, for finding empirical distribution mixture models involving a simple cluster analysis. The cluster analysis utilizes specific coordinate transformations which allow the detection of specialized amino acid distributions either from curated databases or from the alignment at hand. We apply EDCluster to the HOGENOM and HSSP databases in order to provide universal distribution mixture (UDM) models comprising up to 4,096 components. Detailed analyses of the UDM models demonstrate the removal of various long-branch attraction artifacts and improved performance compared with the C10–C60 models. Ready-to-use implementations of the UDM models are provided for three established software packages (IQ-TREE, Phylobayes, and RevBayes).

15.
This paper extends the multilevel survival model by allowing for a cured fraction in the model. Random effects induced by the multilevel clustering structure are specified in the linear predictors of both the hazard function and the cured probability parts. Adopting the generalized linear mixed model (GLMM) approach to formulate the problem, parameter estimation is achieved by maximizing a best linear unbiased prediction (BLUP) type log-likelihood at the initial step of estimation, and is then extended to obtain residual maximum likelihood (REML) estimators of the variance component. As illustrations, the proposed multilevel mixture cure model is applied to analyze (i) child survival study data with multilevel clustering and (ii) chronic granulomatous disease (CGD) data on recurrent infections. A simulation study is carried out to evaluate the performance of the REML estimators and assess the accuracy of the standard error estimates.
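The structure of a mixture cure model with a cluster-level random effect in each part can be sketched as below. The logistic link for the cured probability, the constant (exponential) baseline hazard, and all parameter names are assumptions chosen to keep the example short; the abstract does not specify the paper's link functions or baseline, and no estimation (BLUP or REML) is attempted here.

```python
import math


def logistic(eta):
    """Inverse logit, mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def cured_probability(linear_predictor, cluster_effect):
    """Cured probability with a random effect added to the linear predictor."""
    return logistic(linear_predictor + cluster_effect)


def uncured_hazard(baseline_hazard, linear_predictor, cluster_effect):
    """Proportional hazard for uncured subjects, with its own random effect."""
    return baseline_hazard * math.exp(linear_predictor + cluster_effect)


def population_survival(t, cure_prob, hazard):
    """Mixture cure survival: cured subjects never fail, uncured follow exp(-hazard * t)."""
    return cure_prob + (1.0 - cure_prob) * math.exp(-hazard * t)
```

The population survival curve starts at 1 and levels off at the cured probability instead of dropping to zero, which is the defining feature of a cure model.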

16.
17.
Zebrafish models have significantly contributed to our understanding of vertebrate development and, more recently, human disease. The growing number of genetic tools available in zebrafish research has resulted in the identification of many genes involved in developmental and disease processes. In particular, studies in the zebrafish have clarified roles of the p53 tumor suppressor in the formation of specific tumor types, as well as roles of p53 family members during embryonic development. The zebrafish has also been instrumental in identifying novel mechanisms of p53 regulation and highlighting the importance of these mechanisms in vivo. This article will summarize how zebrafish models have been used to reveal numerous, important aspects of p53 function.
The zebrafish, Danio rerio, is a small model organism that has long been used to study vertebrate development. Zebrafish embryos are optically clear and develop externally to the mother, facilitating the study of early developmental processes. In addition, zebrafish have increasingly been used in modeling human diseases, including a number of cancers. The availability of forward and reverse genetic tools in the zebrafish has resulted in the identification and characterization of many genes involved in development and disease. One gene that has been extensively studied is the p53 tumor suppressor gene, which is structurally and functionally conserved in the zebrafish. This article will discuss how studies in the zebrafish have increased our understanding of how p53 contributes to the formation of specific tumor types, resulted in the identification of novel mechanisms of p53 regulation, and showed how p53 and p53 family members are involved in embryonic development.

18.
ABSTRACT The Mahalanobis distance statistic (D²) has emerged as an effective tool to identify suitable habitat from presence data alone, but there has been no mechanism to select among potential habitat covariates. We propose that the best combination of explanatory variables for a D² model can be identified by ranking potential models based on the proportion of the entire study area that is classified as potentially suitable habitat given that a predetermined proportion of occupied locations are correctly classified. In effect, our approach seeks to minimize errors of commission, or maximize specificity, while holding the omission error rate constant. We used this approach to identify potentially suitable habitat for the Olympic marmot (Marmota olympus), a declining species endemic to Olympic National Park, Washington, USA. We compared models built with all combinations of 11 habitat variables. A 7-variable model identified 21,143 ha within the park as potentially suitable for marmots, correctly classifying 80% of occupied locations. Additional refinements to the 7-variable model (e.g., eliminating small patches) further reduced the predicted area to 18,579 ha with little reduction in predictive power. Although we sought a model that would allow field workers to find 80% of Olympic marmot locations, in fact, <3% of 376 occupied locations and <9% of abandoned locations were >100 m from habitat predicted by the final model, suggesting that >90% of occupied marmot habitat could be found by observant workers surveying predicted habitat. The model comparison procedure allowed us to identify the suite of covariates that maximized specificity of our model and, thus, limited the amount of less favorable habitat included in the final prediction area.
We expect that by maximizing specificity of models built from presence-only data, our model comparison procedure will be useful to conservation practitioners planning reintroductions, searching for rare species, or identifying habitat for protection.
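The ranking criterion described in this abstract can be sketched in two steps: compute the squared Mahalanobis distance of each location from the mean habitat vector of occupied sites, then pick the cutoff that captures a fixed share of occupied locations and measure how much of the study area falls under it. The sketch is restricted to two habitat variables so the covariance matrix can be inverted by hand; the function names, the two-variable restriction, and the quantile rule used for the cutoff are assumptions for illustration, not the authors' implementation.

```python
import math


def mahalanobis_d2(x, mean, cov):
    """Squared Mahalanobis distance for two variables.

    x and mean are (v1, v2) pairs; cov is a 2 x 2 covariance matrix given
    as ((a, b), (c, d)). Uses the closed-form inverse of a 2 x 2 matrix.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx = x[0] - mean[0]
    dy = x[1] - mean[1]
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def suitable_area_fraction(occupied_d2, landscape_d2, sensitivity=0.8):
    """Share of the landscape classified suitable at a fixed sensitivity.

    The cutoff is the smallest D^2 value that correctly classifies at least
    the requested proportion of occupied locations. Candidate models with a
    smaller returned fraction have higher specificity.
    """
    ranked = sorted(occupied_d2)
    index = max(math.ceil(sensitivity * len(ranked)) - 1, 0)
    cutoff = ranked[index]
    suitable = sum(1 for value in landscape_d2 if value <= cutoff)
    return suitable / len(landscape_d2)
```

Comparing candidate covariate sets then amounts to calling `suitable_area_fraction` once per set and preferring the set with the smallest result, which holds the omission error rate constant while minimizing commission errors.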

19.
Mixture modeling applications in psychology often include covariates to explain class membership and aid in construct validation of the latent classification variable. These applications tend to use between-class models involving only main effects of predictors. However, a variety of developmental theories posit interactions among risk and protective variables in predicting membership in trajectory classes or behavioral symptom profiles. This article bridges this disconnect between substantive theory and methodological practice by presenting and comparing two approaches for testing interactive effects of predictors on class membership: product term (PT) and multiple group (MG) approaches. For each approach, we discuss alternative interpretation strategies involving predicted probabilities and odds ratios; we also discuss when the approaches provide equivalent inferences. Published longitudinal and cross-sectional mixture model applications that had originally allowed for only additive effects on class membership are re-analyzed to illustrate the testing and interpretation of interactive effects on class membership using both PT and MG approaches.

20.