Minimal document set retrieval

11 years 6 months ago
Minimal document set retrieval
This paper presents a novel formulation and approach to the minimal document set retrieval problem. Minimal Document Set Retrieval (MDSR) is a promising information retrieval task in which each query topic is assumed to have different subtopics; the task is to retrieve and rank relevant document sets with maximum coverage but minimum redundancy of subtopics in each set. For this task, we propose three document set retrieval and ranking algorithms: Novelty Based method, Cluster Based method and Subtopic Extraction Based method. In order to evaluate the system performance, we design a new evaluation framework for document set ranking which evaluates both relevance between set and query topic, and redundancy within each set. Finally, we compare the performance of the three algorithms using the TREC interactive track dataset. Experimental results show the effectiveness of our algorithms. Categories and Subject Descriptors H.3.3 [Information Search and Retrieval]: Retrieval Models – se...
Wei Dai, Rohini K. Srihari
Added 26 Jun 2010
Updated 26 Jun 2010
Type Conference
Year 2005
Where CIKM
Authors Wei Dai, Rohini K. Srihari
Comments (0)