We propose a new method of classifying documents into categories. We define for each category a finite mixture model based on soft clustering of words. We treat the problem of cla...
Most previous corpus-based algorithms disambiguate a word with a classifier trained from previous usages of the same word. Separate classifiers have to be trained for different wo...
We propose a unified framework in which to treat semantic underspecification and parallelism phenomena in discourse. The framework employs a constraint language that can express e...
This paper describes an approach to extract the aspectual information of Japanese verb phrases from a monolingual corpus. We classify Verbs into six categories by means of the asp...
This paper presents a trainable rule-based algorithm for performing word segmentation. The algorithm provides a simple, language-independent alternative to large-scale lexicai-bas...