Abstract. Syllable-based text compression is a new approach to compression by symbols. In this concept, syllables are used as the compression symbols instead of the more common char...
This paper proposes a method for automatic POS (part-of-speech) guessing of Chinese unknown words. It contains two models. The first model uses a machine-learning method to predict...
For a very long time, it has been considered that the only way of automatically extracting similar groups of words from a text collection for which no semantic information exists ...
In this paper we explore an unsupervised approach to classifying video content by analyzing the corresponding subtitles. The proposed method is based on the WordNet lexical database a...
Bag-of-words approaches to information retrieval (IR) are effective but assume independence between words. The Hyperspace Analogue to Language (HAL) is a cognitively motivated and...