The Mathematical Model of k-mer Index for DNA Sequence Problem Based on Hash Algorithm (基于Hash算法的DNA序列k-mer index问题的数学建模)

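Cite this article:	Guo Fangzhou, Hua Yang, Dong Xiuwei, Cai Zhidan. The Mathematical Model of k-mer Index for DNA Sequence Problem Based on Hash Algorithm [J]. Journal of Changchun University of Science and Technology, 2015(5): 116-119.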

Authors:	Guo Fangzhou (郭方舟), Hua Yang (华阳), Dong Xiuwei (董修伟), Cai Zhidan (蔡志丹)

Affiliation:	School of Science, Changchun University of Science and Technology, Changchun 130022

Abstract:	To address the problem of finding sequences similar to a given DNA sequence, we give mathematical models for building and searching an index. Based on the Hash algorithm, we establish a sequential index model and a hash index model, chosen according to the size of k. For larger values of k in particular, we use the DJBHash function, which effectively avoids hash collisions. Finally, on a platform with a 2.6 GHz CPU, 8 GB of memory, and 64-bit Windows 7, we test 1 million DNA sequences of length 100 and report the time and memory needed to build and query the index for different values of k, effectively solving the k-mer index problem for DNA sequences.
Keywords:	Hash algorithm; index problem; mathematical model; complexity analysis
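
The abstract names two index layouts (a sequential index for small k and a DJBHash-based hash index for larger k) without giving their details. The sketch below illustrates one plausible reading and is not the authors' implementation. The class name `KmerIndex`, the threshold `direct_limit`, the bucket count `num_buckets`, and the 2-bit base encoding are all assumptions made for illustration. The hash follows the commonly cited DJB form (start at 5381, multiply by 33, add the next character), and each bucket stores the k-mer itself so that any collision that does occur is resolved by comparison.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit encoding, A/C/G/T only


def encode_kmer(kmer):
    """Map a k-mer to an integer in [0, 4**k) by reading it as a base-4 number."""
    value = 0
    for base in kmer:
        value = value * 4 + CODE[base]
    return value


def djb_hash(text, num_buckets):
    """DJB string hash (h = h * 33 + c, seed 5381), kept to 32 bits, then reduced."""
    h = 5381
    for ch in text:
        h = ((h << 5) + h + ord(ch)) & 0xFFFFFFFF
    return h % num_buckets


class KmerIndex:
    """Hypothetical k-mer index: direct table for small k, hashed buckets otherwise."""

    def __init__(self, sequences, k, direct_limit=10, num_buckets=1 << 20):
        self.k = k
        self.direct = k <= direct_limit  # assumed switch-over rule
        self.num_buckets = num_buckets
        # Sequential index: one slot per possible k-mer (4**k slots).
        # Hash index: sparse mapping from bucket number to stored entries.
        self.table = [[] for _ in range(4 ** k)] if self.direct else {}
        for seq_id, seq in enumerate(sequences):
            for pos in range(len(seq) - k + 1):
                self._insert(seq[pos:pos + k], seq_id, pos)

    def _insert(self, kmer, seq_id, pos):
        if self.direct:
            self.table[encode_kmer(kmer)].append((seq_id, pos))
        else:
            bucket = self.table.setdefault(djb_hash(kmer, self.num_buckets), [])
            bucket.append((kmer, seq_id, pos))

    def query(self, kmer):
        """Return (sequence id, offset) pairs where the k-mer occurs."""
        if len(kmer) != self.k:
            raise ValueError("query length must equal k")
        if self.direct:
            return list(self.table[encode_kmer(kmer)])
        bucket = self.table.get(djb_hash(kmer, self.num_buckets), [])
        # Compare the stored k-mer so colliding entries are filtered out.
        return [(s, p) for stored, s, p in bucket if stored == kmer]
```

For example, `KmerIndex(["ACGTACGT", "TTACGA"], k=3).query("ACG")` reports every position of `ACG` in both sequences. The direct table costs memory proportional to 4^k regardless of the data, which is why a sketch like this switches to hashing once k grows.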
This paper is indexed in CNKI, Wanfang Data, and other databases.