Distributions of the senses of words are often highly skewed. This fact is exploited by word sense disambiguation (WSD) systems which back off to the predominant (most frequent) s...
The goal of this article is to raise a debate on the usefulness of providing data mining models as services on the internet. These services can be provided by anyone with adequate...
Passages can be hidden within a text to circumvent their disallowed transfer. Such release of compartmentalized information is of concern to all corporate and governmental organiz...
Text categorization is a well-known task based essentially on statistical approaches using neural networks, Support Vector Machines and other machine learning algorithms. Texts are...
Taxonomies of the Web typically have hundreds of thousands of categories and skewed category distribution over documents. It is not clear whether existing text classification tech...
Tie-Yan Liu, Yiming Yang, Hao Wan, Qian Zhou, Bin ...