Government regulations are semi-structured text documents that are often voluminous, heavily cross-referenced between provisions and even ambiguous. Multiple sources of regulation...
Web spamming techniques aim to achieve undeserved rankings in search results. Research has been widely conducted on identifying such spam and neutralizing its influence. However,...
We consider the problem of retrieving multiple documents relevant to the single subtopics of a given web query, termed “full-subtopic retrieval”. To solve this problem we pres...
Andrea Bernardini, Claudio Carpineto, Massimiliano...
Information retrieval systems conventionally assess document relevance using the bag of words model. Consequently, relevance scores of documents retrieved for different queries a...
Deepak Agarwal, Evgeniy Gabrilovich, Robert Hall, ...
In this paper, we target on the problem of personal name disambiguation in search results returned by personal name queries. Usually, a personal name refers to several people. The...