– This paper describes a text categorization approach that is based on a combination of a newly designed text representation with a kNN classifier. The new text document represen...
Representing lexicons and sentences with the subsymbolic approach (using techniques such as Self Organizing Map (SOM) or Artificial Neural Network (ANN)) is a relatively new but i...
This paper addresses the indexing and retrieval of mathematical symbols from digitized documents. The proposed approach exploits Shape Contexts (SC) to describe the shape of mathe...
We present a new visualization of the distance and cluster structure of high dimensional data. It is particularly well suited for analysis tasks of users unfamiliar with complex d...
We propose a novel and efficient solution to the problem of clustering XML documents based on their structure. We use operations on multisets of paths of document trees to define...