The Burrows-Wheeler Transform (BWT) is a compression method which reorders an input string into the form, which is preferable to another compression. Usually Move-To-Front transfor...
Text clustering is potentially very useful for exploration of text sets that are too large to study manually. The success of such a tool depends on whether the results can be expl...
Abstract. We extend an automatically generated bilingual JapaneseSwedish dictionary with new translations, automatically discovered from the multi-lingual online encyclopedia Wikip...
This paper describes an accurate, extensible method for automatically classifying unknown foreign words that requires minimal monolingual resources and no bilingual training data ...
This paper presents and evaluates several original techniques for the latent classification of biographic attributes such as gender, age and native language, in diverse genres (co...