Similar literature
1.
Our rapidly growing knowledge of genetic variation in the human genome offers great potential for understanding the genetic etiology of disease. This, in turn, could revolutionize detection, treatment, and in some cases prevention of disease. While genes for most of the rare monogenic diseases have already been discovered, most common diseases are complex traits, resulting from multiple gene-gene and gene-environment interactions. Detecting epistatic genetic interactions that predispose to disease is an important, but computationally daunting, task currently facing bioinformaticists. Here, we propose a new evolutionary approach that attempts to hill-climb from large sets of candidate epistatic genetic features to smaller sets, inspired by Kauffman’s “random chemistry” approach to detecting small auto-catalytic sets of molecules from within large sets. Although the algorithm is conceptually straightforward, its success hinges upon the creation of a fitness function able to discriminate large sets that contain subsets of interacting genetic features from those that don’t. Here, we employ an approximate and noisy fitness function based on the ReliefF data mining algorithm. We establish proof-of-concept using synthetic data sets, where individual features have no marginal effects. We show that the resulting algorithm can successfully detect epistatic pairs from up to 1,000 candidate single nucleotide polymorphisms in time that is linear in the size of the initial set, although the success rate degrades as heritability declines. Research continues into seeking a more accurate fitness approximator for large sets and other algorithmic improvements that will enable us to extend the approach to larger data sets and to lower heritabilities.
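The set-shrinking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `random_chemistry`, the `shrink` and `max_tries` parameters, and the idealized, noise-free `contains_signal` oracle are all assumptions made for illustration. The abstract instead describes an approximate, noisy ReliefF-based fitness function, which is not reproduced here.

```python
import random

def random_chemistry(candidates, contains_signal, shrink=0.5, max_tries=200, rng=None):
    """Shrink a large candidate set toward a small interacting subset.

    contains_signal(subset) -> bool is a hypothetical stand-in for a fitness
    test that reports whether a subset still holds the interacting features.
    """
    rng = rng or random.Random(0)
    current = list(candidates)
    while len(current) > 2:
        size = max(2, int(len(current) * shrink))
        for _ in range(max_tries):
            subset = rng.sample(current, size)
            if contains_signal(subset):
                current = subset  # accept the smaller set and keep shrinking
                break
        else:
            return current  # no smaller passing subset found; stop early
    return current

# Idealized oracle (assumption): features 17 and 42 form the epistatic pair.
pair = {17, 42}
result = random_chemistry(range(100), lambda s: pair <= set(s))
print(sorted(result))
```

Because each accepted step multiplies the set size by a constant factor, the number of shrinking stages grows only logarithmically with the initial set size; the cost of each fitness evaluation is what a real implementation would need to keep low.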

2.
Routine Discovery of Complex Genetic Models using Genetic Algorithms
Simulation studies are useful in various disciplines for a number of reasons including the development and evaluation of new computational and statistical methods. This is particularly true in human genetics and genetic epidemiology where new analytical methods are needed for the detection and characterization of disease susceptibility genes whose effects are complex, nonlinear, and partially or solely dependent on the effects of other genes (i.e. epistasis or gene-gene interaction). Despite this need, the development of complex genetic models that can be used to simulate data is not always intuitive. In fact, only a few such models have been published. We have previously developed a genetic algorithm approach to discovering complex genetic models in which two single nucleotide polymorphisms (SNPs) influence disease risk solely through nonlinear interactions. In this paper, we extend this approach for the discovery of high-order epistasis models involving three to five SNPs. We demonstrate that the genetic algorithm is capable of routinely discovering interesting high-order epistasis models in which each SNP influences risk of disease only through interactions with the other SNPs in the model. This study opens the door for routine simulation of complex gene-gene interactions among SNPs for the development and evaluation of new statistical and computational approaches for identifying common, complex multifactorial disease susceptibility genes.

3.
4.
Detecting and visualizing nonlinear interactive effects of Single Nucleotide Polymorphisms (SNPs), or epistatic interactions, are important topics in signal processing that pose substantial mathematical and computational challenges. To address these problems, a three-stage method, epiMiner (epistasis Miner), is proposed based on co-information theory. In the screening stage, the Co-Information Index (CII) is employed to visualize and rank the contributions of individual SNPs to the phenotype; the number of top-ranking SNPs retained for the next stage is either specified directly by the user or determined automatically by a support vector machine classifier. In the testing stage, a co-information test and a co-information-based permutation test are conducted sequentially to search for epistatic interactions among the retained SNPs, and the results are then ranked by their p-values. To further characterize the broader epistasis landscape, a visualizing stage dynamically constructs epistasis networks by linking pairs of retained SNPs whose co-information values with respect to the phenotype exceed given thresholds. The performance of epiMiner is compared with existing methods on a diverse range of simulated data sets containing several epistasis models. Results demonstrate that epiMiner is effective in detecting and visualizing epistatic interactions. In addition, the application of epiMiner to a real Age-related Macular Degeneration (AMD) data set provides several new clues for the exploration of causative factors of AMD. The Matlab version of the epiMiner software is available free online at https://sourceforge.net/projects/epiminer/files/.
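As a rough illustration of the information-theoretic quantity behind this kind of method, the sketch below estimates how much more two SNPs tell us about the phenotype jointly than they do individually. The function names are hypothetical, and the sign convention (positive for synergy) is an assumption; the abstract does not give epiMiner's exact definition of co-information or the CII, and conventions differ across the literature.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Empirical Shannon entropy (bits) of a sequence of hashable symbols."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def interaction_gain(snp_a, snp_b, phenotype):
    """I(A,B;Y) - I(A;Y) - I(B;Y): positive when the pair is synergistic."""
    joint = list(zip(snp_a, snp_b))
    return (mutual_information(joint, phenotype)
            - mutual_information(snp_a, phenotype)
            - mutual_information(snp_b, phenotype))

# Toy XOR model: neither SNP has a marginal effect, but together they
# determine the phenotype completely.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
y = [0, 1, 1, 0]
print(interaction_gain(a, b, y))
```

In the XOR toy model each SNP alone carries no information about the phenotype, so all of the joint information counts as interaction, which is the purely epistatic situation these methods target.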

5.
The building block hypothesis implies that genetic algorithm (GA) effectiveness is influenced by the relative location of epistatic genes on the chromosome. We demonstrate this effect in four experiments, where chromosomes with adjacent epistatic genes provide improved results over chromosomes with separated epistatic genes. We also show that information-theoretic reconstructability analysis can be used to decide on optimal gene ordering.

6.
Interaction detection in large-scale genetic association studies has attracted intensive research interest, since many diseases are complex traits. Various approaches have been developed for finding significant genetic interactions. In this article, we propose a novel framework, SRMiner, to detect interacting susceptible and protective genotype patterns. SRMiner can discover not only probable combinations of single nucleotide polymorphisms (SNPs) causing diseases but also the corresponding SNPs suppressing their pathogenic functions, which provides a better perspective for uncovering the underlying relevance between genetic variants and complex diseases. We have performed extensive experiments on several real Wellcome Trust Case Control Consortium (WTCCC) datasets. We use pathway-based and protein-protein interaction (PPI) network-based evaluation methods to verify the discovered patterns. The results show that SRMiner successfully identifies many disease-related genes verified by existing work. Furthermore, SRMiner can also infer some unconfirmed but highly plausible disease-related genes.

7.
The post replying behavior in online communities (OCs) has garnered little consideration, even though the feedback behavior represents the central social dynamic of OCs and greatly determines the vibrancy of OCs. To fill this gap, this study aims to identify major sharing post-related variables that explain the heterogeneity in the post replying behavior in knowledge sharing OCs. The research model is validated through a panel dataset assembled from an online travel community. The results reveal that sharing post length and vividness, contributors’ expertise and degree centrality, and members’ social interactions have significant associations with the number of replying posts.

8.
A Gene Linkage Learning Genetic Algorithm Based on the Ant Colony Algorithm
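This paper proposes a gene linkage learning genetic algorithm based on the ant colony algorithm. In this algorithm, the population of the genetic algorithm corresponds to the ant colony, and each chromosome of the genetic algorithm is at the same time an ant of the ant colony algorithm. Each time a crossover or mutation operation is performed, the algorithm first computes the linkage strength between the genes of the parent individuals from the pheromone matrix of the ant colony algorithm, and then selects the crossover and mutation sites according to that linkage strength. This prevents building blocks from being excessively disrupted by the genetic operators, reduces the search space of the genetic algorithm, and guides the direction of the search. Linkage learning is carried out in parallel in this algorithm, whereas in Harik's algorithm it is carried out serially; in addition, the encoding length does not grow multiplicatively as the number of alleles increases. Experiments on bounded-difficulty problems and the TSP verify the effectiveness of the algorithm.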
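One way to picture linkage-guided crossover is sketched below. The abstract does not say how linkage strength is derived from the pheromone matrix or how sites are sampled, so everything here is an assumption: neighboring loci are treated as linked in proportion to their pheromone entry, and a one-point cut is drawn with probability inversely proportional to that entry. The names `pick_cut_point` and `linkage_aware_crossover` are hypothetical.

```python
import random

def pick_cut_point(pheromone, rng, eps=1e-9):
    """Choose a cut between locus i and i+1, favoring weakly linked neighbors.

    pheromone[i][j] is assumed to measure linkage between loci i and j.
    """
    n = len(pheromone)
    weights = [1.0 / (pheromone[i][i + 1] + eps) for i in range(n - 1)]
    i = rng.choices(range(n - 1), weights=weights, k=1)[0]
    return i + 1  # child takes parent_a[:cut] + parent_b[cut:]

def linkage_aware_crossover(parent_a, parent_b, pheromone, rng):
    """One-point crossover whose cut avoids splitting strongly linked genes."""
    cut = pick_cut_point(pheromone, rng)
    return parent_a[:cut] + parent_b[cut:], cut

# Loci (0,1) and (2,3) are strongly linked; loci (1,2) are only weakly linked.
S, W = 1e9, 1e-9
tau = [[0, S, 0, 0],
       [S, 0, W, 0],
       [0, W, 0, S],
       [0, 0, S, 0]]
child, cut = linkage_aware_crossover([0, 0, 0, 0], [1, 1, 1, 1], tau, random.Random(1))
print(cut, child)
```

With these extreme weights the cut falls between the two strongly linked blocks, so each block is inherited intact, which is the building-block-preserving behavior the abstract describes.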

9.
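With its incidence rising year by year, diabetes is becoming an increasingly severe global health problem, especially in developing countries, and most diabetes patients have type 2 diabetes. It has been scientifically shown that timely and effective diagnosis can prevent or delay about 80% of type 2 diabetes complications. Based on a large-scale imbalanced data set, an ensemble model is proposed for accurately diagnosing diabetes patients. The data set contains the medical records of millions of people in a province of China from 2009 to 2015. Experimental results show that the method performs well, achieving a sensitivity of 91.00%, an F3 score of 58.24%, and a G-mean of 86.69%.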
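The three reported metrics are commonly used for imbalanced classification. The sketch below uses their standard textbook definitions, assuming that F3 means the F-beta score with beta = 3 and that G-mean is the geometric mean of sensitivity and specificity; the abstract itself does not spell out the formulas, and the confusion-matrix counts are made up for illustration only.

```python
import math

def sensitivity(tp, fn):
    """Recall on the positive class: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Recall on the negative class: TN / (TN + FP)."""
    return tn / (tn + fp)

def f_beta(tp, fp, fn, beta=3.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    b2 = beta * beta
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)

def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity."""
    return math.sqrt(sensitivity(tp, fn) * specificity(tn, fp))

# Hypothetical confusion-matrix counts, not taken from the paper.
tp, fn, tn, fp = 90, 10, 800, 100
print(sensitivity(tp, fn), f_beta(tp, fp, fn), g_mean(tp, fn, tn, fp))
```

A high sensitivity alongside a much lower F3 score, as reported in the abstract, is what one would expect when positives are rare: even a modest false-positive rate on the large negative class drags precision down.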

10.
Genome-wide association studies (GWAS) involve the detection and interpretation of epistasis, which is responsible for the ‘missing heritability’ and influences common complex disease susceptibility. Many epistasis detection algorithms cannot be directly applied to GWAS because many combinations of genetic components are present in only a small number of samples, or in none at all. Because of the huge number of single nucleotide polymorphisms and the use of inappropriate statistical tests, epistasis detection remains a computational and statistical challenge in genetic epidemiology. Here, we develop a novel method to identify epistatic interactions related to disease susceptibility, utilizing an ant colony optimization strategy implemented on Google's MapReduce platform. We incorporate expert knowledge into the pheromone updating rule to guide the ants toward the best choice during the search process. We conduct extensive experiments using simulated and real genome-wide data sets, and the experimental results demonstrate excellent performance of our algorithm compared with its competitors.
