TREC
2007

On Retrieving Legal Files: Shortening Documents and Weeding Out Garbage

This paper describes our participation in the TREC Legal experiments in 2007. We applied novel normalization techniques designed to slightly favor longer documents rather than assuming that all documents should have equal weight. We also developed a new method for reformulating query text when background information is provided with an information request, and we experimented with enhanced OCR error detection to reduce the size of the term list and remove noise from the data. In this article, we discuss the impact of these techniques on the TREC 2007 data sets. We show that simple normalization methods significantly outperform cosine normalization in the legal domain.
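To make the normalization contrast concrete, the following is a minimal sketch and not the paper's implementation. It compares cosine normalization with one simple alternative, dividing by the logarithm of the document's total term count. The abstract does not give the formulas used, so the function names, the log-based divisor, the raw-count term weights, and the toy documents are all assumptions chosen for illustration.

```python
import math


def cosine_normalize(weights):
    """Scale a term-weight vector to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0:
        return dict(weights)
    return {term: w / norm for term, w in weights.items()}


def log_normalize(weights):
    """Divide each weight by the log of the total term count.

    Illustrative assumption: this is one example of a "simple"
    normalization, not necessarily the formula used in the paper.
    """
    length = sum(weights.values())
    denom = math.log(length) if length > 1 else 1.0
    return {term: w / denom for term, w in weights.items()}


def score(query_terms, doc_weights):
    """Sum the normalized weights of the query terms found in the document."""
    return sum(doc_weights.get(term, 0.0) for term in query_terms)


# Hypothetical toy documents with raw term counts as weights.
short_doc = {"contract": 2, "breach": 1, "damages": 1}
long_doc = {"contract": 6, "breach": 3, "damages": 3,
            "tobacco": 8, "memo": 10, "filter": 5}
query = ["contract", "breach"]

for name, normalize in [("cosine", cosine_normalize), ("log", log_normalize)]:
    s_short = score(query, normalize(short_doc))
    s_long = score(query, normalize(long_doc))
    print(f"{name:>6}: short={s_short:.3f} long={s_long:.3f}")
```

On this toy data, cosine normalization ranks the short document first, while the log-based divisor grows slowly with length and lets the longer document, which contains more query-term occurrences, score slightly higher.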
Scott Kulp, April Kontostathis
Added: 07 Nov 2010
Updated: 07 Nov 2010
Type: Conference
Year: 2007
Where: TREC
Authors: Scott Kulp, April Kontostathis