Learning useful and predictable features from past workloads and exploiting them well is a major source of improvement in many operating system problems. We review known parallel ...
Information available in the Internet is frequently supplied simply as plain ascii text, structured according to orthographic and semantic conventions. Traditional document classi...
Many real-world datasets can be clustered along multiple dimensions. For example, text documents can be clustered not only by topic, but also by the author's gender or sentim...
We built a system for the automatic creation of a textbased topic hierarchy, meant to be used in a geographically defined community. This poses two main problems. First, the appea...
Text categorization is a well-known task based essentially on statistical approaches using neural networks, Support Vector Machines and other machine learning algorithms. Texts are...