Greedy Algorithms for the Shortest Common Superstring That Are Asymptotically Optimal |
| |
Authors: | A Frieze W Szpankowski |
| |
Affiliation: | (1) Department of Mathematics, Carnegie Mellon University, Pittsburgh, PA 15213, USA. af1p@andrew.cmu.edu., US;(2) Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA. spa@cs.purdue.edu., US |
| |
Abstract: | There has recently been a resurgence of interest in the shortest common superstring problem due to its important applications in molecular biology (e.g., recombination of DNA) and data compression. The problem
is NP-hard, but it has been known for some time that greedy algorithms work well for this problem. More precisely, it was
proved in a recent sequence of papers that in the worst case a greedy algorithm produces a superstring that is at most β times (2≤β≤ 4 ) worse than optimal. We analyze the problem in a probabilistic framework, and consider the optimal total overlap O
n
opt
and the overlap O
n
rg
produced by various greedy algorithms. These turn out to be asymptotically equivalent. We show that with high probability
, where n is the number of original strings and H is the entropy of the underlying alphabet. Our results hold under a condition that the lengths of all strings are not too
short.
Received November 1996; revised March 1997. |
| |
Keywords: | , Common superstring, Greedy heuristics, Mixing model, Tries, Greedy matching, |
本文献已被 SpringerLink 等数据库收录! |
|