New standards in document representation, like for example SGML, XML, and MPEG-7, compel Information Retrieval to design and implement models and tools to index, retrieve and prese...
Fabio Crestani, Luis M. de Campos, Juan M. Fern&aa...
Abstract. We present a model for complex documents possibly consisting of a hierarchically structured set of images or texts. Documents are represented both at the form level (as s...
Carlo Meghini, Fabrizio Sebastiani, Umberto Stracc...
Enterprise corpora contain evidence of what employees work on and therefore can be used to automatically find experts on a given topic. We present a general approach for represen...
The proliferation of text documents on the web as well as within institutions necessitates their convenient organization to enable efficient retrieval of information. Although tex...
Sriharsha Veeramachaneni, Diego Sona, Paolo Avesan...
In this work, we study similarity measures for text-centric XML documents based on an extended vector space model, which considers both document content and structure. Experimenta...