Sciweavers

CLEF
2009
Springer

Batch Document Filtering Using Nearest Neighbor Algorithm

This paper describes the participation of the LIG lab in the batch filtering task of the INFILE (INformation FILtering Evaluation) campaign of CLEF 2009. In the online task, the server provides the documents one by one; in the batch task, all of the documents are provided beforehand, so no feedback is possible. We propose a batch algorithm that learns category-specific thresholds in a multiclass environment where a document can belong to more than one class. The algorithm uses the k-nearest neighbor algorithm to filter the 100,000 documents into 50 topics. The experiments were run on the English corpus and gave a precision of 0.256 and a recall of 0.295. We had participated in the online task of INFILE 2008, where we used an online algorithm that relied on feedback from the server. In comparison with INFILE 2008, the recall is significantly better in 2009 (0.295 vs 0.260). However, the precision in...
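The general idea of k-nearest neighbor filtering with one threshold per topic can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the scoring rule, or how thresholds are learned, so the cosine similarity, the summed-similarity topic score, the leave-one-out F1 criterion, the function names, and the toy data below are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def topic_scores(doc, train, k, skip=None):
    """Per-topic score: summed similarity of the k nearest training docs.

    `skip` excludes one training index (used for leave-one-out scoring).
    """
    sims = [(cosine(doc, vec), topics)
            for i, (vec, topics) in enumerate(train) if i != skip]
    sims.sort(key=lambda pair: pair[0], reverse=True)
    scores = defaultdict(float)
    for sim, topics in sims[:k]:
        for topic in topics:
            scores[topic] += sim
    return scores

def learn_thresholds(train, k):
    """Assumed criterion: per topic, keep the cutoff with the best
    leave-one-out F1 on the training set."""
    all_topics = {t for _, topics in train for t in topics}
    loo = [topic_scores(vec, train, k, skip=i)
           for i, (vec, _) in enumerate(train)]
    thresholds = {}
    for topic in all_topics:
        scored = [(s.get(topic, 0.0), topic in train[i][1])
                  for i, s in enumerate(loo)]
        best_f1, best_cut = -1.0, float("inf")
        for cut in sorted({s for s, _ in scored if s > 0}):
            tp = sum(1 for s, rel in scored if s >= cut and rel)
            fp = sum(1 for s, rel in scored if s >= cut and not rel)
            fn = sum(1 for s, rel in scored if s < cut and rel)
            f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
            if f1 > best_f1:
                best_f1, best_cut = f1, cut
        thresholds[topic] = best_cut
    return thresholds

def filter_document(doc, train, thresholds, k):
    """Return every topic whose score reaches that topic's threshold."""
    scores = topic_scores(doc, train, k)
    return {t for t, s in scores.items()
            if s >= thresholds.get(t, float("inf"))}

# Hypothetical toy training set: (term weights, set of topics).
train = [
    ({"football": 1, "goal": 1}, {"sport"}),
    ({"football": 1, "match": 1}, {"sport"}),
    ({"election": 1, "vote": 1}, {"politics"}),
    ({"election": 1, "minister": 1}, {"politics"}),
    ({"football": 1, "minister": 1}, {"sport", "politics"}),
]
thresholds = learn_thresholds(train, k=2)
assigned = filter_document({"football": 1, "goal": 1, "match": 1},
                           train, thresholds, k=2)
```

Because each topic has its own threshold, a document can be assigned to several topics, to one, or to none, which matches the setting described in the abstract where a document may belong to more than one class.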
Added 08 Nov 2010
Updated 08 Nov 2010
Type Conference
Year 2009
Where CLEF
Authors Ali Mustafa Qamar, Éric Gaussier, Nathalie Denos