We discuss the idea of modelling the statistical distributions of scores of documents, classified as relevant or non-relevant. Various specific combinations of standard statistic...
The semantic annotation of documents is an additional advantage for retrieval, as long as the annotations and their maintenance process scale well. Automatic or semi-automatic ann...
We propose three heuristics to determine the country of origin of a person or institution via text-based IE from the Web. We evaluate all methods on a collection of music artists ...
Markus Schedl, Klaus Seyerlehner, Dominik Schnitze...
Web search engines are facing formidable performance challenges as they need to process thousands of queries per second over billions of documents. To deal with this heavy workloa...
Proximity searching consists in retrieving from a database, objects that are close to a query. For this type of searching problem, the most general model is the metric space, where...