The paper describes the first version of the TextMOLE (Text Mining Operations Library and Environment) system for textual data mining. Currently TextMOLE acts as an advanced inde...
Abstract. We present the results of UMBC’s participation in the Web and Novelty tracks. We explored various heuristics-based link analysis approaches to the Topic Distillation ta...
Srikanth Kallurkar, Yongmei Shi, R. Scott Cost, Ch...
Category ranking provides a way to classify plain text documents into a pre-determined set of categories. This work proposes to have a look at typical document collections and ana...
We describe the objectives and organization of the CLEF 2005 ad hoc track and discuss the main characteristics of the tasks offered to test monolingual, bilingual and multilingual ...
Giorgio Maria Di Nunzio, Nicola Ferro, Gareth J. F...
—We present an OCR-driven writer identification algorithm in this paper. Our algorithm learns writer-specific characteristics more precisely from explicit character alignment usi...