Prior Art Retrieval Using the Claims Section as a Bag of Words

We describe our participation in the 2009 CLEF-IP task, which targeted prior-art search for topic patent documents. Our system retrieved patent documents using a standard bag-of-words approach for both the Main Task and the English Task. In both runs, we extracted the claims sections from all English patents in the corpus and saved them in the Lemur index format, with the patent IDs as DOCIDs. These claims were then indexed using Lemur's BuildIndex function. In the topic documents we also focussed exclusively on the claims sections, which were extracted and converted to queries by removing stopwords and punctuation. We did not perform any term selection. We retrieved 100 patents per topic using Lemur's RetEval function with the TF-IDF retrieval model. Compared to the other runs submitted for the track, we obtained good results in terms of nDCG (0.46) and moderate results in terms of MAP (0.054).

Categories and Subject Descriptors: H.3 [Information Storage and Retrieval]: H.3.1...
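The query-construction and ranking steps described in the abstract can be illustrated with a minimal sketch. It does not use Lemur: a plain TF-IDF dot product stands in for Lemur's TF-IDF retrieval model, whose exact weighting may differ. The stopword list, the patent IDs, the claim texts, and all function names here are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; the abstract does not give the list used).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "with", "said", "wherein"}


def claims_to_terms(claims_text):
    """Lowercase, strip punctuation, and drop stopwords; no further term selection."""
    tokens = re.findall(r"[a-z0-9]+", claims_text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_index(claims_by_patent_id):
    """Return per-document term counts and per-term document frequencies."""
    doc_terms = {pid: Counter(claims_to_terms(text))
                 for pid, text in claims_by_patent_id.items()}
    doc_freq = Counter()
    for counts in doc_terms.values():
        doc_freq.update(counts.keys())
    return doc_terms, doc_freq


def retrieve(query_claims, doc_terms, doc_freq, k=100):
    """Score documents by a plain TF-IDF dot product; return the top k (patent ID, score) pairs."""
    n_docs = len(doc_terms)
    query_counts = Counter(claims_to_terms(query_claims))
    scores = {}
    for pid, counts in doc_terms.items():
        score = 0.0
        for term, q_tf in query_counts.items():
            if term in counts:
                idf = math.log(1 + n_docs / doc_freq[term])
                # Query weight (q_tf * idf) times document weight (tf * idf).
                score += q_tf * counts[term] * idf * idf
        if score > 0:
            scores[pid] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical claims sections keyed by hypothetical patent IDs.
corpus = {
    "EP-0001": "A pump comprising a rotor and a sealed housing.",
    "EP-0002": "A method for encoding video frames with a motion estimator.",
    "EP-0003": "A housing for a rotor, wherein the housing is sealed.",
}
doc_terms, doc_freq = build_index(corpus)
ranking = retrieve("A sealed rotor housing for a pump.", doc_terms, doc_freq)
```

In this sketch the whole claims section of the topic patent becomes the query, mirroring the choice in the abstract not to perform term selection; `k=100` matches the 100 patents retrieved per topic.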

Suzan Verberne, Eva D'hondt

CLEF 2009 | Information Technology | Lemur's BuildIndex Function | Patent Documents | Topic Patent Documents |

Added	08 Nov 2010
Updated	08 Nov 2010
Type	Conference
Year	2009
Where	CLEF
Authors	Suzan Verberne, Eva D'hondt
