On Optimally Partitioning a Text to Improve Its Compression

15 years 10 months ago


In this paper we investigate the problem of partitioning an input string T in such a way that compressing its parts individually via a base compressor C yields a compressed output that is shorter than the one obtained by applying C to the entire T at once. This problem was introduced in [2, 3] in the context of table compression, and then further elaborated and extended to strings and trees by [10, 11, 21]. Unfortunately, the literature offers poor solutions: namely, we know either a cubic-time algorithm for computing the optimal partition based on dynamic programming [3, 15], or a few heuristics that do not guarantee any bounds on the efficacy of their computed partition [2, 3], or algorithms that are efficient but work only in some specific scenarios (such as the Burrows-Wheeler Transform, see e.g. [10, 21]) and achieve compression performance that might be worse than that of the optimal partitioning by an Ω(√(log n)) factor. Therefore, efficiently computing the optimal solution is still an open problem [4]. In this paper ...
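The cubic-time dynamic-programming baseline mentioned above can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes zlib as a stand-in for the base compressor C, and the names `compressed_size` and `optimal_partition` are hypothetical. It tries every possible last part T[j:i] for every prefix T[:i], so it makes a quadratic number of compressor calls, each taking time roughly linear in the part length.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` under the stand-in base compressor C (zlib, an assumption)."""
    return len(zlib.compress(data, 9))


def optimal_partition(text: bytes):
    """Return (total compressed size, parts) minimizing the sum of per-part sizes.

    best[i] holds the minimum total compressed size over all partitions of text[:i];
    cut[i] records where the last part of that best partition starts.
    """
    n = len(text)
    best = [0] + [float("inf")] * n
    cut = [0] * (n + 1)
    for i in range(1, n + 1):
        for j in range(i):
            # Cost of the best partition of text[:j] plus one more part text[j:i].
            cost = best[j] + compressed_size(text[j:i])
            if cost < best[i]:
                best[i] = cost
                cut[i] = j
    # Walk the recorded cut points backwards to recover the parts.
    parts = []
    i = n
    while i > 0:
        j = cut[i]
        parts.append(text[j:i])
        i = j
    parts.reverse()
    return best[n], parts


if __name__ == "__main__":
    sample = b"abababababab" + b"xyzxyzxyzxyz"
    total, parts = optimal_partition(sample)
    print(total, compressed_size(sample), parts)
```

Because the unpartitioned text is itself one of the candidates, the returned total is never larger than `compressed_size(text)`. With zlib, every part pays a fixed header and checksum overhead, so on short inputs the best partition may well be the whole text; the sketch is meant to show the recurrence, not realistic savings.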

Paolo Ferragina, Igor Nitto, Rossano Venturini
