A new type of thesaurus for word processing is proposed. It comprises 7 semantic and 8 syntagmatic types of links between Russian words and collocations. The original version now ...
We present a novel extension to a recently proposed incremental learning algorithm for the word segmentation problem originally introduced in Goldwater (2006). By adding rejuvenat...
The problem of computing periods in words, or finite sequences of symbols from a finite alphabet, has important applications in several areas including data compression, string se...
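As a small illustration of what a period of a word is (not the method of the paper above, whose details are cut off here), the following sketch lists every period of a word using the standard border table. An integer p is a period of a word w if w[i] = w[i + p] wherever both positions exist. The function name `periods` is hypothetical and chosen only for this example.

```python
def periods(word):
    """Return all periods p (1 <= p <= len(word)) of `word`, in increasing order.

    Illustrative sketch only: uses the classic border (failure-function) table.
    p is a period of word exactly when word has a border of length len(word) - p.
    """
    n = len(word)
    # border[i] = length of the longest proper border of word[:i]; border[0] = -1 is a sentinel.
    border = [0] * (n + 1)
    border[0] = -1
    k = -1
    for i in range(n):
        # Fall back through shorter borders until one can be extended by word[i].
        while k >= 0 and word[k] != word[i]:
            k = border[k]
        k += 1
        border[i + 1] = k

    # Walk the chain of borders of the whole word; each border of length b gives period n - b.
    result = []
    b = border[n]
    while b >= 0:
        result.append(n - b)
        b = border[b]
    return result


if __name__ == "__main__":
    print(periods("abaab"))
    print(periods("abcabcab"))
```

The table is built in time linear in the length of the word, and the smallest period is the first element of the returned list.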
Abstract appeared in: Proc. of 7th Ann. Int. Computing and Combinatorics Conference, COCOON 2001 (ed. J. Wang), Lecture Notes in Computer Science Vol. 2108, Springer-Verlag, Berlin...
We describe a new technique for reducing the number of nodes and symbols in automata based on tries. The technique stems from some results on anti-dictionaries for data compression...
Maxime Crochemore, Chiara Epifanio, Roberto Grossi...