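Mathematical Modeling of the k-mer Index Problem for DNA Sequences Based on the Hash Algorithm
Cite this article: Guo Fangzhou, Hua Yang, Dong Xiuwei, Cai Zhidan. The Mathematical Model of k-mer Index for DNA Sequence Problem Based on Hash Algorithm [J]. Journal of Changchun University of Science and Technology, 2015(5): 116-119.
Authors: Guo Fangzhou  Hua Yang  Dong Xiuwei  Cai Zhidan
Affiliation: School of Science, Changchun University of Science and Technology, Changchun, 130022
Abstract: To address the problem of finding similar sequences among DNA sequences, mathematical models for building and querying an index are given. Based on the Hash algorithm, a sequential index model and a hash index model are established, with the choice between them depending on the value of k. For larger values of k in particular, the DJBHash function is used, which effectively avoids Hash collisions. Finally, on a hardware platform with a 2.6 GHz CPU, 8 GB of memory, and 64-bit Windows 7, 1 million DNA sequences of length 100 were tested, and the time and memory needed to build and query the index are reported for different values of k. The approach effectively solves the k-mer index problem for DNA sequences.

Keywords: Hash algorithm  index problem  mathematical model  complexity analysis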

The Mathematical Model of k-mer Index for DNA Sequence Problem Based on Hash Algorithm
Abstract: This paper presents mathematical models for building and searching an index used to find similar DNA sequences. Based on the Hash algorithm, we establish a sequential index model and a hash index model, chosen according to the value of k. To avoid hash collisions, we use the DJBHash function for larger values of k. Finally, we test 1 million DNA sequences of length 100 on a platform with a 2.6 GHz CPU, 8 GB of memory, and 64-bit Windows 7, and report the time and memory required to build and search the index for different values of k. The approach solves the k-mer index problem for DNA sequences effectively.
Keywords:Hash algorithm  index problem  mathematical model  complexity analysis
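
The abstract names two index models but gives no implementation details. The following is a minimal Python sketch of the general idea, not the authors' code. It assumes sequences contain only the bases A, C, G, and T, that small k uses an exact 2-bit-per-base integer code as the key, and that larger k uses the widely circulated DJB string hash (seed 5381, multiplier 33) truncated to 32 bits. The names `encode_kmer`, `djb_hash`, `build_index`, `query`, and the threshold `direct_limit` are hypothetical, and a dictionary stands in for whatever table layout the paper actually uses.

```python
from collections import defaultdict

# Assumption: only the four standard bases occur in the input.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_kmer(kmer):
    """Map a k-mer to a unique integer in [0, 4**k), 2 bits per base."""
    value = 0
    for base in kmer:
        value = value * 4 + BASE_CODE[base]
    return value


def djb_hash(text):
    """DJB string hash: h = h * 33 + c, seed 5381, truncated to 32 bits."""
    h = 5381
    for ch in text:
        h = ((h << 5) + h + ord(ch)) & 0xFFFFFFFF
    return h


def build_index(sequences, k, direct_limit=12):
    """Index every k-mer position; the key function depends on k.

    direct_limit is a hypothetical threshold, not a value from the paper.
    """
    key_fn = encode_kmer if k <= direct_limit else djb_hash
    index = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        for pos in range(len(seq) - k + 1):
            index[key_fn(seq[pos:pos + k])].append((seq_id, pos))
    return index, key_fn


def query(index, key_fn, sequences, kmer):
    """Return (sequence id, position) pairs where kmer occurs."""
    k = len(kmer)
    hits = index.get(key_fn(kmer), [])
    # Re-check the text so that any hash collision is filtered out.
    return [(s, p) for s, p in hits if sequences[s][p:p + k] == kmer]


if __name__ == "__main__":
    seqs = ["ACGTACGT", "TTACGTAA"]
    idx, key_fn = build_index(seqs, k=4)
    print(query(idx, key_fn, seqs, "ACGT"))  # [(0, 0), (0, 4), (1, 2)]
```

The exact integer code cannot collide, but its key space grows as 4^k, which is why a sketch like this switches to a fixed-width hash for larger k and verifies each hit against the sequence text.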
This article is indexed in CNKI, Wanfang Data, and other databases.