Efficient Trie-Based Sorting of Large Sets of Strings

Sorting is a fundamental algorithmic task. Many general-purpose sorting algorithms have been developed, but efficiency gains can be achieved by designing algorithms for specific kinds of data, such as strings. In previous work we have shown that our burstsort, a trie-based algorithm for sorting strings, is more efficient on large data sets than all previous algorithms for this task. In this paper we re-evaluate some of the implementation details of burstsort, in particular the method for managing buckets held at leaves. We show that a better choice of data structures further improves the efficiency, at a small additional cost in memory. For sets of around 30,000,000 strings, our improved burstsort is nearly twice as fast as the previous best sorting algorithm.
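To make the idea of a trie with buckets at its leaves concrete, here is a minimal sketch of the general burstsort scheme. It is an illustration under stated assumptions, not the authors' implementation: the names Node, BURST_THRESHOLD, and burstsort are hypothetical, child pointers are held in a Python dict rather than a character-indexed array, buckets are plain Python lists, and each bucket is sorted with the built-in sorted() as a stand-in for whatever string sort a real implementation would use. The bucket data structures evaluated in the paper are not reproduced here.

```python
BURST_THRESHOLD = 4  # assumed value for illustration; real buckets are far larger


class Node:
    """Trie node. Each child is either another Node or a list (a leaf bucket)."""

    def __init__(self):
        self.children = {}  # next character -> Node or bucket (list of strings)
        self.ended = []     # strings that are fully consumed at this node


def _insert(node, s, depth):
    """Insert s below node; the first `depth` characters are already consumed."""
    while True:
        if depth == len(s):
            node.ended.append(s)
            return
        c = s[depth]
        child = node.children.get(c)
        if child is None:
            node.children[c] = [s]  # start a new bucket
            return
        if isinstance(child, Node):
            node, depth = child, depth + 1  # descend one level
            continue
        child.append(s)
        if len(child) > BURST_THRESHOLD:
            # Burst: replace the full bucket with a trie node and
            # redistribute its strings by their next character.
            new_node = Node()
            node.children[c] = new_node
            for t in child:
                _insert(new_node, t, depth + 1)
        return


def _traverse(node, out):
    """Append the strings below node to out in sorted order."""
    out.extend(node.ended)  # shorter strings precede their extensions
    for c in sorted(node.children):
        child = node.children[c]
        if isinstance(child, Node):
            _traverse(child, out)
        else:
            out.extend(sorted(child))  # small bucket: sort directly


def burstsort(strings):
    """Return a new sorted list of the given strings."""
    root = Node()
    for s in strings:
        _insert(root, s, 0)
    out = []
    _traverse(root, out)
    return out
```

For example, burstsort(["banana", "apple", "band", "ban", "apple"]) gives the same order as Python's sorted() on the same list. The point of the structure is that each string is routed by only a few leading characters, and the expensive comparison-based work is confined to small buckets of strings that already share a prefix.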
Ranjan Sinha, Justin Zobel
Added 23 Aug 2010
Updated 23 Aug 2010
Type Conference
Year 2003
Where ACSC
Authors Ranjan Sinha, Justin Zobel