PROBLEM: Automatic keyword assignment has been largely studied in medical informatics in the context of the MEDLINE database, both for helping search in MEDLINE and in order to pr...
In this paper we investigate ChineseEnglish name transliteration using comparable corpora, corpora where texts in the two languages deal in some of the same topics -- and therefor...
Feature selection methods have been successfully applied to text categorization but seldom applied to text clustering due to the unavailability of class label information. In this...
In this paper, we propose a new classification method that addresses classification in multiple categories of textual documents. We call it Matrix Regression (MR) due to its resem...
Iulian Sandu Popa, Karine Zeitouni, Georges Gardar...
This paper studies the problem of identifying comparative sentences in text documents. The problem is related to but quite different from sentiment/opinion sentence identification...