This paper reports on the underlying IR problems encountered when indexing and searching with the Bulgarian language. For this language we propose a general light stemmer and demon...
Background: A fundamental goal of the U.S. National Institutes of Health (NIH) "Roadmap" is to strengthen Translational Research, defined as the movement of discoveries i...
Alan Ruttenberg, Tim Clark, William Bug, Matthias ...
The ranking function used by search engines to order results is learned from labeled training data. Each training point is a (query, URL) pair that is labeled by a human judge who...
Rakesh Agrawal, Alan Halverson, Krishnaram Kenthap...
An under-explored question in cross-language information retrieval (CLIR) is to what degree the performance of CLIR methods depends on the availability of high-quality translation...
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies...