ICDE
2008
IEEE

RAD: A Scalable Framework for Annotator Development

Developments in semantic search technology have motivated the need for efficient and scalable entity annotation techniques. We demonstrate RAD, a tool for Rapid Annotator Development on a document collection. RAD builds on a recent approach [1] that translates entity annotation rules into equivalent operations on the inverted index of the collection and so directly generates an annotation index, which can be used in search applications. To make the framework scalable, we use an industrial-strength indexer, Lucene [2], and introduce some modifications to its API. The index also serves as a suitable representation for quick comparison against an indexed ground truth of annotations on the same collection, to evaluate the precision and recall of the annotations. RAD achieves at least an order of magnitude speedup over the standard approach of annotating one document at a time, as adopted by GATE [3]. The speedup factor grows with the size of the collection, making RAD scalable. We...
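The idea of evaluating an annotation rule on the inverted index, rather than scanning each document, can be illustrated with a minimal sketch. This is not RAD's implementation and does not use the Lucene API: the toy in-memory positional index, the function names (build_index, annotate_sequence, precision_recall), the example rule "a title term immediately followed by a surname term", and the sample documents and ground truth are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Toy positional inverted index: token -> set of (doc_id, position)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.split()):
            index[token].add((doc_id, pos))
    return index


def annotate_sequence(index, first_terms, second_terms):
    """Hypothetical rule: a term from first_terms immediately followed by a
    term from second_terms. Only posting lists are consulted, never the
    document text. Returns annotation spans as (doc_id, start, end)."""
    first = set().union(*(index.get(t, set()) for t in first_terms))
    second = set().union(*(index.get(t, set()) for t in second_terms))
    return {(d, p, p + 1) for (d, p) in first if (d, p + 1) in second}


def precision_recall(predicted, truth):
    """Compare two sets of annotation spans."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Illustrative data only (assumed, not from the paper).
docs = {
    1: "Dr Smith met Mr Jones",
    2: "Mr Brown called Dr Who",
}
index = build_index(docs)
predicted = annotate_sequence(index, {"Dr", "Mr"}, {"Smith", "Jones", "Brown"})
truth = {(1, 0, 1), (1, 3, 4), (2, 0, 1), (2, 3, 4)}
precision, recall = precision_recall(predicted, truth)
```

In this sketch the rule misses the span (2, 3, 4) because "Who" is not in the assumed surname list, so precision is 1.0 and recall is 0.75. Storing both the produced annotations and the ground truth as sets of spans is what makes the comparison a cheap set intersection, loosely mirroring the abstract's point about comparing against an indexed ground truth.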
Added 01 Nov 2009
Updated 01 Nov 2009
Type Conference
Year 2008
Where ICDE
Authors Sanjeet Khaitan, Ganesh Ramakrishnan, Sachindra Joshi, Anup Chalamalla