Similar literature
 20 similar articles found
1.
Family studies to identify disease-related genes frequently collect only families with multiple cases. It is often desirable to determine if risk factors that are known to influence disease risk in the general population also play a role in the study families. If so, these factors should be incorporated into the genetic analysis to control for confounding. Pfeiffer et al. [2001 Biometrika 88: 933-948] proposed a variance components or random effects model to account for common familial effects and for different genetic correlations among family members. After adjusting for ascertainment, they found maximum likelihood estimates of the measured exposure effects. Although it is appealing that this model accounts for genetic correlations as well as for the ascertainment of families, in order to perform an analysis one needs to specify the distribution of random genetic effects. The current work investigates the robustness of the proposed model with respect to various misspecifications of genetic random effects in simulations. When the true underlying genetic mechanism is polygenic with a small dominant component, or Mendelian with low allele frequency and penetrance, the effects of misspecification on the estimation of fixed effects in the model are negligible. The model is applied to data from a family study on nasopharyngeal carcinoma in Taiwan.

2.
The presence of measurement errors affecting the covariates in regression models is a relevant topic in many scientific areas, for example in epidemiology. An example is given by an epidemiological population-based matched case-control study on the aetiology of childhood malignancies, which is currently under completion in Italy. This study was aimed at evaluating the effects of childhood exposure to extremely low electromagnetic fields on the risk of disease occurrence by taking into account the possibility of erroneous measures of the exposure. Within this framework, we focus on the application of likelihood methods to correct for measurement error. This approach, which has received less attention in the literature than the alternatives, is compared with commonly used methods such as regression calibration and SIMEX. The comparison is performed by simulation, under a broad range of measurement error structures.

3.
Family-based case-control studies are popularly used to study the effect of genes and gene-environment interactions in the etiology of rare complex diseases. We consider methods for the analysis of such studies under the assumption that genetic susceptibility (G) and environmental exposures (E) are distributed independently of each other within families in the source population. Conditional logistic regression, the traditional method of analysis of the data, fails to exploit the independence assumption and hence can be inefficient. Alternatively, one can estimate the multiplicative interaction between G and E more efficiently using cases only, but the required population-based G-E independence assumption is very stringent. In this article, we propose a novel conditional likelihood framework for exploiting the within-family G-E independence assumption. This approach leads to a simple and yet highly efficient method of estimating interaction and various other risk parameters of scientific interest. Moreover, we show that the same paradigm also leads to a number of alternative and even more efficient methods for analysis of family-based case-control studies when parental genotype information is available on the case-control study participants. Based on these methods, we evaluate different family-based study designs by examining their efficiencies relative to each other and their efficiencies compared to a population-based case-control design of unrelated subjects. These comparisons reveal important design implications. Extensions of the methodologies for dealing with complex family studies are also discussed.

4.
This paper is concerned with the problem of estimating the demand for health care with panel data. A random effects model is specified within a semiparametric Bayesian approach using a Dirichlet process prior. This results in a very flexible distribution for both the random effects and the count variable. In particular, the model can be seen as a mixture distribution with a random number of components, and is therefore a natural extension of prevailing latent class models. A full Bayesian analysis using Markov chain Monte Carlo simulation methods is proposed. The methodology is illustrated with an application using data from Germany.

5.
In matched case‐crossover studies, it is generally accepted that the covariates on which a case and associated controls are matched cannot exert a confounding effect on independent predictors included in the conditional logistic regression model. This is because any stratum effect is removed by the conditioning on the fixed number of sets of the case and controls in the stratum. Hence, the conditional logistic regression model is not able to detect any effects associated with the matching covariates by stratum. However, some matching covariates such as time often play an important role as an effect modifier, leading to incorrect statistical estimation and prediction. Therefore, we propose three approaches to evaluate effect modification by time. The first is a parametric approach, the second is a semiparametric penalized approach, and the third is a semiparametric Bayesian approach. Our parametric approach is a two‐stage method, which uses conditional logistic regression in the first stage and then estimates polynomial regression in the second stage. Our semiparametric penalized and Bayesian approaches are one‐stage approaches developed by using regression splines. Our semiparametric one‐stage approach allows us to not only detect the parametric relationship between the predictor and binary outcomes, but also evaluate nonparametric relationships between the predictor and time. We demonstrate the advantage of our semiparametric one‐stage approaches using both a simulation study and an epidemiological example of a 1‐4 bi‐directional case‐crossover study of childhood aseptic meningitis with drinking water turbidity. We also provide statistical inference for the semiparametric Bayesian approach using Bayes Factors. Copyright © 2016 John Wiley & Sons, Ltd.

6.
When modeling the risk of a disease, the very act of selecting the factors to be included can heavily impact the results. This study compares the performance of several variable selection techniques applied to logistic regression. We performed realistic simulation studies to compare five methods of variable selection: (1) a confidence interval (CI) approach for significant coefficients, (2) backward selection, (3) forward selection, (4) stepwise selection, and (5) Bayesian stochastic search variable selection (SSVS) using both informed and uninformed priors. We defined our simulated diseases mimicking odds ratios for cancer risk found in the literature for environmental factors, such as smoking; dietary risk factors, such as fiber; genetic risk factors, such as XPD; and interactions. We modeled the distribution of our covariates, including correlation, after the reported empirical distributions of these risk factors. We also used a null data set to calibrate the priors of the Bayesian method and evaluate its sensitivity. Of the standard methods (95 per cent CI, backward, forward, and stepwise selection) the CI approach resulted in the highest average per cent of correct associations and the lowest average per cent of incorrect associations. SSVS with an informed prior had a higher average per cent of correct associations and a lower average per cent of incorrect associations than the CI approach. This study shows that the Bayesian methods offer a way to use prior information to both increase power and decrease false-positive results when selecting factors to model complex disease risk.

7.
Rice K. Statistics in Medicine 2003; 22(20): 3177-3194
We consider analysis of matched case-control studies where a binary exposure is potentially misclassified, and there may be a variety of matching ratios. The parameter of interest is the ratio of odds of case exposure to control exposure. By extending the conditional model for perfectly classified data via a random effects or Bayesian formulation, we obtain estimates and confidence intervals for the misclassified case which reduce back to standard analytic forms as the error probabilities reduce to zero. Several examples are given, highlighting different analytic phenomena. In a simulation study, using mixed matching ratios, the coverage of the intervals is found to be good, although point estimates are slightly biased on the log scale. Extensions of the basic model are given allowing for uncertainty in the knowledge of misclassification rates, and the inclusion of prior information about the parameter of interest.

8.
A new goodness-of-fit test for the logistic regression model is proposed. It exploits the property of this model that when it is correct, i.e. not misspecified, the parameter estimates are (asymptotically) invariant under reweighting the observations by weights $w_i$ that are a function of the binary (0/1) outcomes $y_i$. Misspecification of the model can thus be concluded when parameter estimates change under reweighting. A local test, considering weights of the form $w_i = 1 + \epsilon y_i$, is explored. The test is especially suitable for case-control studies but may be used in other contexts as well.

9.
This paper describes a new approach to the estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied. For each term in the composite likelihood, a conditional likelihood is used that eliminates the influence of the random effects, which results in a composite conditional likelihood consisting of only one‐dimensional integrals that may be solved numerically. Good properties of the resulting estimator are described in a small simulation study. Copyright © 2015 John Wiley & Sons, Ltd.

10.
We employ a general bias preventive approach developed by Firth (Biometrika 1993; 80:27-38) to reduce the bias of an estimator of the log-odds ratio parameter in a matched case-control study by solving a modified score equation. We also propose a method to calculate the standard error of the resultant estimator. A closed-form expression for the estimator of the log-odds ratio parameter is derived in the case of a dichotomous exposure variable. Finite sample properties of the estimator are investigated via a simulation study. Finally, we apply the method to analyze matched case-control data from a low birthweight study.
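
As a rough illustration of the idea in this abstract, the sketch below assumes 1:1 matching with a dichotomous exposure, so that the conditional likelihood reduces to a binomial likelihood in the discordant pairs. Under that assumption, Firth's penalty amounts to adding 1/2 to each discordant count. The function names are hypothetical, and the closed form derived in the paper itself may differ, for example under other matching ratios.

```python
import math


def conditional_mle_log_or(n10, n01):
    """Conditional MLE of the log-odds ratio from discordant pairs.

    n10: pairs in which only the case is exposed.
    n01: pairs in which only the control is exposed.
    Undefined (raises an error) when either count is zero.
    """
    return math.log(n10 / n01)


def firth_log_or(n10, n01):
    """Firth-type estimate under the 1:1 matched (binomial) assumption.

    Adding 1/2 to each discordant count keeps the estimate finite
    and shrinks it towards zero.
    """
    return math.log((n10 + 0.5) / (n01 + 0.5))
```

With 12 and 4 discordant pairs, for instance, the penalized estimate is smaller in magnitude than the conditional MLE, and it remains finite even when one of the counts is zero.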

11.
We propose a semiparametric multivariate skew–normal joint model for multivariate longitudinal and multivariate survival data. One main feature of the posited model is that we relax the commonly used normality assumption for random effects and within‐subject error by using a centered Dirichlet process prior to specify the random effects distribution and using a multivariate skew–normal distribution to specify the within‐subject error distribution and model trajectory functions of longitudinal responses semiparametrically. A Bayesian approach is proposed to simultaneously obtain Bayesian estimates of unknown parameters, random effects and nonparametric functions by combining the Gibbs sampler and the Metropolis–Hastings algorithm. Particularly, a Bayesian local influence approach is developed to assess the effect of minor perturbations to within‐subject measurement error and random effects. Several simulation studies and an example are presented to illustrate the proposed methodologies. Copyright © 2014 John Wiley & Sons, Ltd.

12.
In epidemiology, one approach to investigating the dependence of disease risk on an explanatory variable in the presence of several confounding variables is by fitting a binary regression using a conditional likelihood, thus eliminating the nuisance parameters. When the explanatory variable is measured with error, the estimated regression coefficient is biased, usually towards zero. Motivated by the need to correct for this bias in analyses that combine data from a number of case-control studies of lung cancer risk associated with exposure to residential radon, two approaches are investigated. Both employ the conditional distribution of the true explanatory variable given the measured one. The method of regression calibration uses the expected value of the true given measured variable as the covariate. The second approach integrates the conditional likelihood numerically by sampling from the distribution of the true given measured explanatory variable. The two approaches give very similar point estimates and confidence intervals not only for the motivating example but also for an artificial data set with known properties. These results and some further simulations that demonstrate correct coverage for the confidence intervals suggest that for studies of residential radon and lung cancer the regression calibration approach will perform very well, so that nothing more sophisticated is needed to correct for measurement error.
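
The regression calibration step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's procedure: it assumes a classical additive error model W = X + U with a known error variance, and it uses the linear predictor of X given W, which equals the conditional expectation when X and U are normal. The function and argument names are hypothetical.

```python
import statistics


def regression_calibration(measured, error_variance):
    """Replace each measured value W by an estimate of E[X | W].

    Assumes W = X + U, with U independent of X and var(U) known.
    Returns the calibrated values and the estimated reliability ratio
    var(X) / var(W).
    """
    mean_w = statistics.fmean(measured)
    var_w = statistics.variance(measured)  # estimates var(X) + var(U)
    reliability = max(var_w - error_variance, 0.0) / var_w
    calibrated = [mean_w + reliability * (w - mean_w) for w in measured]
    return calibrated, reliability


# Illustrative values only: sample variance 2.5, assumed error variance 0.5.
calibrated, reliability = regression_calibration([1.0, 2.0, 3.0, 4.0, 5.0], 0.5)
```

The calibrated values are pulled towards the mean by the reliability ratio. Fitting the risk model to them instead of the raw measurements scales the coefficient up by roughly the inverse of that ratio, which is what offsets the bias towards zero.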

13.
Estimation and testing of genetic effects (genotype relative risks) are often performed conditionally on parental genotypes, using data from case-parent trios. This strategy avoids having to estimate nuisance parameters such as parental mating type frequencies, and also avoids generating spurious results due to confounding causes of association such as population stratification. For effects at a single locus, the resulting analysis is equivalent to matched case/control analysis via conditional logistic regression, using the case and three "pseudocontrols" derived from the untransmitted parental alleles. We previously showed that a similar approach can be used for analyzing genotype and haplotype effects at a set of closely linked loci, but with a required adjustment to the conditioning argument that results in varying numbers of pseudocontrols, depending on the disease model that is to be fitted. Here we extend this method to include the analysis of epistatic effects (gene-gene interactions) at unlinked loci, to include parent-of-origin effects at one or more loci, and to allow additional incorporation of gene-environment interactions. The conditional logistic approach provides a natural and flexible framework for incorporating these additional effects. By relaxing the conditioning on parental genotypes to allow exchangeability of parental genotypes, we show how the power of this approach can be increased when studying parent-of-origin effects. Simulations suggest that there is limited power to distinguish between parent-of-origin effects and effects due to interaction between genotypes of mother and child.

14.
We propose a method to analyze family‐based samples together with unrelated cases and controls. The method builds on the idea of matched case–control analysis using conditional logistic regression (CLR). For each trio within the family, a case (the proband) and matched pseudo‐controls are constructed, based upon the transmitted and untransmitted alleles. Unrelated controls, matched by genetic ancestry, supplement the sample of pseudo‐controls; likewise unrelated cases are also paired with genetically matched controls. Within each matched stratum, the case genotype is contrasted with control/pseudo‐control genotypes via CLR, using a method we call matched‐CLR (mCLR). Eigenanalysis of numerous SNP genotypes provides a tool for mapping genetic ancestry. The result of such an analysis can be thought of as a multidimensional map, or eigenmap, in which the relative genetic similarities and differences amongst individuals are encoded. Once constructed, new individuals can be projected onto the ancestry map based on their genotypes. Successful differentiation of individuals of distinct ancestry depends on having a diverse, yet representative sample from which to construct the ancestry map. Once samples are well‐matched, mCLR yields comparable power to competing methods while ensuring excellent control over Type I error. Copyright © 2010 John Wiley & Sons, Ltd.

15.
The genetic case-control association study of unrelated subjects is a leading method to identify single nucleotide polymorphisms (SNPs) and SNP haplotypes that modulate the risk of complex diseases. Association studies often genotype several SNPs in a number of candidate genes; we propose a two-stage approach to address the inherent statistical multiple comparisons problem. In the first stage, each gene's association with disease is summarized by a single p-value that controls a familywise error rate. In the second stage, summary p-values are adjusted for multiplicity using a false discovery rate (FDR) controlling procedure. For the first stage, we consider marginal and joint tests of SNPs and haplotypes within genes, and we construct an omnibus test that combines SNP and haplotype analysis. Simulation studies show that when disease susceptibility is conferred by a SNP, and all common SNPs in a gene are genotyped, marginal analysis of SNPs using the Simes test has similar or higher power than marginal or joint haplotype analysis. Conversely, haplotype analysis can be more powerful when disease susceptibility is conferred by a haplotype. The omnibus test tracks the more powerful of the two approaches, which is generally unknown. Multiple testing balances the desire for statistical power against the implicit costs of false positive results, which up to now appear to be common in the literature.
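
The two-stage idea in this abstract can be sketched in a few lines. The first stage below uses the Simes combination, which the abstract names as one option for summarizing the SNPs within a gene. The second stage uses the Benjamini-Hochberg adjustment, which is an assumption here, since the abstract does not say which FDR-controlling procedure is applied. Gene names and p-values are made up for illustration.

```python
def simes_pvalue(pvalues):
    """Simes combined p-value: the minimum over i of n * p_(i) / i."""
    ordered = sorted(pvalues)
    n = len(ordered)
    return min(n * p / rank for rank, p in enumerate(ordered, start=1))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-SNP p-values for three candidate genes.
snp_pvalues = {
    "gene_a": [0.01, 0.04, 0.30],
    "gene_b": [0.20, 0.50],
    "gene_c": [0.002, 0.60, 0.70, 0.90],
}

# Stage 1: one summary p-value per gene.
gene_p = {gene: simes_pvalue(p) for gene, p in snp_pvalues.items()}
# Stage 2: adjust the gene-level p-values for multiplicity.
gene_q = dict(zip(gene_p, benjamini_hochberg(list(gene_p.values()))))
```

Summarizing within genes first means the second-stage adjustment is applied across genes rather than across every SNP tested.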

16.
In an individually matched case-control study, effects of potential risk factors are ascertained through conditional logistic regression (CLR). Extension of CLR to situations with multiple disease or reference categories has been made through polychotomous CLR and is shown to be more efficient than carrying out separate CLRs for each subgroup. In this paper, we consider matched case-control studies where there is one control group, but there are multiple disease states with a natural ordering among themselves. This scenario can be observed when the cases can be further classified in terms of the seriousness or progression of the disease, for example, according to different stages of cancer. We explore several popular models for ordered categorical data in this context. We first adopt a cumulative logit or equivalently, a proportional-odds model to account for the ordinal nature of the data. The important distinction of this model from a stratified dichotomous and polychotomous logistic regression model is that the stratum-specific nuisance parameters cannot be eliminated in this model via the conditional-likelihood approach. We discuss a Mantel-Haenszel approach for analysing such data. We point out possible difficulties with standard likelihood-based approaches with the cumulative logit model when applied to case-control data. We then consider an alternative conditional adjacent-category logit model. We illustrate the methods by analysing data from a matched case-control study on low birthweight in newborns where infants are classified according to low and very low birthweight and a child with normal birthweight serves as a control. A simulation study compares the different ordinal methods with methods ignoring sub-classification of the ordered disease states.

17.
In randomized clinical trials, it is common that patients may stop taking their assigned treatments and then switch to a standard treatment (standard of care available to the patient) but not the treatments under investigation. Although the availability of limited retrieved data on patients who switch to standard treatment, called off‐protocol data, could be highly valuable in assessing the associated treatment effect with the experimental therapy, it leads to a complex data structure requiring the development of models that link the information of per‐protocol data with the off‐protocol data. In this paper, we develop a novel Bayesian method to jointly model longitudinal treatment measurements under various dropout scenarios. Specifically, we propose a multivariate normal mixed‐effects model for repeated measurements from the assigned treatments and the standard treatment, a multivariate logistic regression model for those stopping the assigned treatments, logistic regression models for those starting a standard treatment off protocol, and a conditional multivariate logistic regression model for completely withdrawing from the study. We assume that withdrawing from the study is non‐ignorable, but intermittent missingness is assumed to be at random. We examine various properties of the proposed model. We develop an efficient Markov chain Monte Carlo sampling algorithm. We analyze in detail via the proposed method a real dataset from a clinical trial. Copyright © 2013 John Wiley & Sons, Ltd.

18.
For analyses of longitudinal repeated‐measures data, statistical methods include the random effects model, fixed effects model and the method of generalized estimating equations. We examine the assumptions that underlie these approaches to assessing covariate effects on the mean of a continuous, dichotomous or count outcome. Access to statistical software to implement these models has led to widespread application in numerous disciplines. However, careful consideration should be paid to their critical assumptions to ascertain which model might be appropriate in a given setting. To illustrate similarities and differences that might exist in empirical results, we use a study that assessed depressive symptoms in low‐income pregnant women using a structured instrument with up to five assessments that spanned the pre‐natal and post‐natal periods. Understanding the conceptual differences between the methods is important in their proper application even though empirically they might not differ substantively. The choice of model in specific applications would depend on the relevant questions being addressed, which in turn informs the type of design and data collection that would be relevant. Copyright © 2008 John Wiley & Sons, Ltd.

19.
For a dense set of genetic markers such as single nucleotide polymorphisms (SNPs) in high linkage disequilibrium within a small candidate region, a haplotype-based approach for testing association between a disease phenotype and the set of markers is attractive in reducing the data complexity and increasing the statistical power. However, because the status of the underlying disease variant is unknown, a comprehensive association test may require consideration of various combinations of the SNPs, which often leads to severe multiple testing problems. In this paper, we propose a latent variable approach to test for association of multiple tightly linked SNPs in case-control studies. First, we introduce a latent variable into the penetrance model to characterize a putative disease susceptible locus (DSL) that may consist of a marker allele, a haplotype from a subset of the markers, or an allele at a putative locus between the markers. Next, by using a retrospective likelihood to adjust for the case-control sampling ascertainment and appropriately handle the Hardy-Weinberg equilibrium constraint, we develop an expectation-maximization (EM)-based algorithm to fit the penetrance model and estimate the joint haplotype frequencies of the DSL and markers simultaneously. With the latent variable to describe a flexible role of the DSL, the likelihood ratio statistic can then provide a joint association test for the set of markers without requiring an adjustment for testing of multiple haplotypes. Our simulation results also reveal that the latent variable approach may have improved power under certain scenarios compared with classical haplotype association methods.

20.
Measurements on subjects in longitudinal medical studies are often collected at several different times or under different experimental conditions. Such multiple observations on the same subject generally produce serially correlated outcomes. Traditional regression methods assume that observations within subjects are independent, which is not true in longitudinal data. In this paper we develop a Bayesian analysis for the traditional non-linear random effects models with errors that follow a continuous time autoregressive process. In this way, unequally spaced observations do not present a problem in the analysis. Parameter estimation of this model is done via the Gibbs sampling algorithm. The method is illustrated with data from a study of pregnant women in Santiago, Chile, that involves the non-linear regression of plasma volume on gestational age.
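
To see why a continuous time autoregressive error process copes with unequally spaced observations, consider the first-order case, in which the correlation between two errors depends only on the time elapsed between them. The sketch below assumes the common parameterization in which the correlation is phi raised to the absolute time difference, with 0 < phi < 1. The paper's own parameterization and process order are not given in the abstract and may differ, and the function name is hypothetical.

```python
def car1_correlation(times, phi):
    """Correlation matrix with entries phi ** |t_i - t_j|, for 0 < phi < 1."""
    return [[phi ** abs(ti - tj) for tj in times] for ti in times]


# Illustrative, unequally spaced observation times (for example, weeks).
corr = car1_correlation([0.0, 1.0, 3.0], phi=0.5)
```

No regular grid is needed: the gap of 2 between the second and third observations simply yields a weaker correlation than the gap of 1 between the first and second.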
