This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
The traditional strategy performed by Information Retrieval (IR) systems is ranked keyword search: For a given query, a list of documents, ordered by relevance, is returned. Releva...
Pierpaolo Basile, Annalina Caputo, Anna Lisa Genti...
We investigate language models for informational and navigational web search. Retrieval on the web is a task that differs substantially from ordinary ad hoc retrieval. We perfor...
We describe our participation in the 2009 CLEF-IP task, which was targeted at prior-art search for topic patent documents. Our system retrieved patent documents based on a standard...
For our fourth participation in the CLEF evaluation campaigns, our first objective was to propose an effective and general stopword list and a light stemming procedure for the Port...