首页 | 官方网站   微博 | 高级检索  
     


A Randomized Algorithm for Approximate String Matching
Authors:M. J. Atallah  F. Chyzak  P. Dumas
Affiliation:(1) CERIAS andDepartment of Computer Sciences, Purdue University, West Lafayette, IN 47907, USA. mja@cs.purdue.edu., US;(2) INRIA, Rocquencourt, B.P. 105, 78153 Le Chesnay Cedex, France. {Frederic.Chyzak, Philippe.Dumas}@inria.fr., FR
Abstract:We give a randomized algorithm in deterministic time O(Nlog  M) for estimating the score vector of matches between a text string of length N and a pattern string of length M , i.e., the vector obtained when the pattern is slid along the text, and the number of matches is counted for each position. A direct application is approximate string matching. The randomized algorithm uses convolution to find an estimator of the scores; the variance of the estimator is particularly small for scores that are close to M , i.e., for approximate occurrences of the pattern in the text. No assumption is made about the probabilistic characteristics of the input, or about the size of the alphabet. The solution extends to string matching with classes, class complements, ``never match' and ``always match' symbols, to the weighted case and to higher dimensions. Received July 20, 1997; revised April 20, 1998, and June 1, 1999.
Keywords:. Convolution   FFT   Approximate string matching   Randomized algorithms.
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号