We present a new visualization of the distance and cluster structure of high-dimensional data. It is particularly well suited for analysis tasks of users unfamiliar with complex d...
This paper describes a system for efficient indexing and retrieval of words in collections of document images. The proposed method is based on two main principles: unsupervised pr...
A crucial preprocessing stage in applications such as OCR is text extraction from mixed-type documents. Unlike most previous approaches, the present work successfully addresses the pro...
This paper presents a methodology for learning taxonomic relations from a set of documents that each explain one of the concepts. Three different feature extraction approaches with...
This paper describes a text categorization approach that is based on a combination of a newly designed text representation with a kNN classifier. The new text document represen...