Due to the globalization on the Web, many companies and institutions need to efficiently organize and search repositories containing multilingual documents. The management of the...
– This paper describes a text categorization approach that is based on a combination of a newly designed text representation with a kNN classifier. The new text document represen...
To date, few attempts have been made to develop and validate methods for automatic evaluation of linguistic quality in text summarization. We present the first systematic assessme...
Metric distances and the more general concept of dissimilarities are widely used tools in instance-based learning methods and very especially in the nearestneighbor classification...
Background: Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We...