首页 | 官方网站   微博 | 高级检索  
     

单体型装配问题及其算法
引用本文:王瑞省,吴凌云,张继红,章祥荪.单体型装配问题及其算法[J].高校应用数学学报(A辑),2004,19(Z1):515-528.
作者姓名:王瑞省  吴凌云  张继红  章祥荪
作者单位:中国科学院,数学与系统科学研究院应用数学研究所,北京,100080
基金项目:Supported by the National Natural Science Foundation of China (10471141) and the National Postdoctoral Foundation of China.
摘    要:单核苷酸多态性(SNP)单体型装配问题就是从给定的来自某人染色体的SNP片段中去除错误,重构出尽可能与原来片段一致的单体型.这个问题有几个不同的模型最少片段去除(MFR)问题,最少SNP去除(MSR)问题以及最少错误纠正(MEC)问题.前两个问题的复杂性与算法已有一些学者研究过.第三个问题已被证明是NP完全问题,但这个问题的实际算法还没有.该文对MEC问题给出了一个分支定界算法,这个算法能得到问题的全局最优解.通过这个算法对实际数据的计算说明了MEC模型的合理性,即在一定条件下,通过修正最少的错误重构出的单体型确实是真实的单体型.由于分支定界算法对这样一个NP完全问题不能在可接受的时间内解规模较大的问题,文中又给出了求解MEC问题的两个基于动态聚类的算法,以便对规模较大的问题在可接受的时间内得到近似最优解.数值实际表明这两个算法很快,很有效.这两个算法总能得到与分支定界找到的全局最优解很接近的近似最优解.鉴于MEC问题是NP完全的,这两个算法是有效的、实际的算法.

关 键 词:分支定界  动态聚类  单体型装配  SNP  MEC问题
修稿时间:2004年7月8日

ALGORITHMS FOR SNP HAPLOTYPE ASSEMBLY PROBLEM
WANG Rui-sheng,WU Ling-yun,ZHANG Ji-hong,ZHANG Xiang-sun.ALGORITHMS FOR SNP HAPLOTYPE ASSEMBLY PROBLEM[J].Applied Mathematics A Journal of Chinese Universities,2004,19(Z1):515-528.
Authors:WANG Rui-sheng  WU Ling-yun  ZHANG Ji-hong  ZHANG Xiang-sun
Abstract:The SNP haplotype assembly problem is to reconstruct a maximally consistent pair of haplotypes by removing errors from a set of SNP fragments of one individual's DNA.There are several models for the problem: Minimum Fragment Removal(MFR)problem,Minimum SNP Removal(MSR)problem,and Minimum Error Correction(MEC)problem.Complexity and practical algorithms for the first two problems have been studied by several authors.The third problem has been proved to be NP-complete,but there are few practical algorithms for it.In this paper,a branch and bound algorithm to solve the MEC problem for globally optimal solutions is given.And this algorithm is used to illustrate biological rationality of MEC model according to computational results based on real data,i.e.,under certain conditions the haplotypes generated through minimum number of error correction are indeed the real haplotypes.Since the branch and bound algorithm can not solve the large size problems within acceptable time, two algorithms based on dynamic clustering for MEC problem are given to obtain approximately-optimal solutions.Numerical experiments show that the algorithms are very fast and have so good performance that they can find good solutions.Since MEC is an NP-hard problem,these algorithms are effective and practical.
Keywords:branch and bound  dynamic clustering  haplotype assembly  SNP  MEC  
本文献已被 CNKI 万方数据 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号