Suffix trees are versatile data structures that are used for solving many string-matching problems. One of the main arguments against widespread usage of the structure is its spac...
A suffix vector of a string is an index data structure equivalent to a suffix tree. It was first introduced by Monostori et al. in 2001 [9, 10, 11]. They proposed a linear constru...
This paper analyses the efficiency of different data structures for detecting overlap in digital documents. Most existing approaches use some hash function to reduce the space req...
Phrase has been considered as a more informative feature term for improving the effectiveness of document clustering. In this paper, we propose a phrase-based document similarity t...