In this paper, we present the multilingual Sense Folder Corpus. After the analysis of different corpora, we describe the requirements that have to be satisfied for evaluating sema...
With the aim of building a "Semantic Web", the content of documents must be explicitly represented through metadata in order to enable content-guided search. Our app...
We describe a joint probabilistic model for modeling the contents and inter-connectivity of document collections such as sets of web pages or research paper archives. The model is...
Online services such as web search, news portals, and e-commerce applications face the challenge of providing high-quality experiences to a large, heterogeneous user base. Recent ef...
This paper describes the design of a crawler devised to perform the periodic retrieval of Web documents for a search engine able to accept on-line updates in a concurrent manner. ...