DAWAK
2007
Springer

An Efficient Algorithm for Identifying the Most Contributory Substring

Abstract. Detecting repeated portions of strings has important applications to many areas of study including data compression and computational biology. This paper defines and presents a solution for the Most Contributory Substring Problem, which identifies the single substring that represents the largest proportion of the characters within a set of strings. We show that a solution to the problem can be achieved with an O(n) running time (where n is the total number of characters in all of the input strings) when overlapping occurrences of the most contributory substring are permitted. Furthermore, we present an extended algorithm that does not permit occurrences of the most contributory substring to overlap. The expected running time of the extended algorithm is O(n log n) while its worst case performance is O(n^2).
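
To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the paper's O(n) or O(n log n) algorithm. It assumes that a substring's contribution is its length multiplied by its number of occurrences across all input strings, which is one reading of "largest proportion of the characters"; the paper's exact definition may differ. The function names and the allow_overlap parameter are hypothetical, chosen only for illustration.

```python
def count_occurrences(text, pattern, allow_overlap):
    """Count occurrences of pattern in text, scanning left to right."""
    count = 0
    start = 0
    while True:
        idx = text.find(pattern, start)
        if idx == -1:
            return count
        count += 1
        # Advance by one character when overlaps are permitted,
        # otherwise skip past the whole match.
        start = idx + (1 if allow_overlap else len(pattern))


def most_contributory_substring(strings, allow_overlap=True):
    """Brute-force search. Assumed score: length * total occurrences."""
    # Enumerate every distinct non-empty substring of every input string.
    candidates = {
        s[i:j]
        for s in strings
        for i in range(len(s))
        for j in range(i + 1, len(s) + 1)
    }
    best, best_score = None, 0
    # Sorting makes tie-breaking deterministic (lexicographically first wins).
    for cand in sorted(candidates):
        occurrences = sum(
            count_occurrences(s, cand, allow_overlap) for s in strings
        )
        score = len(cand) * occurrences
        if score > best_score:
            best, best_score = cand, score
    return best, best_score


if __name__ == "__main__":
    print(most_contributory_substring(["abcabc", "xabc"]))
    print(most_contributory_substring(["aaaa"], allow_overlap=True))
    print(most_contributory_substring(["aaaa"], allow_overlap=False))
```

The two calls on "aaaa" show why the overlapping and non-overlapping variants are distinct problems: under the assumed score, permitting overlaps lets "aa" be counted three times, while forbidding them limits it to two. This sketch enumerates every substring, so its cost grows far faster than the bounds stated in the abstract; it is useful only for checking small cases.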
Ben Stephenson
Added: 14 Aug 2010
Updated: 14 Aug 2010
Type: Conference
Year: 2007
Where: DAWAK
Authors: Ben Stephenson