Information retrieval algorithms leverage various collection statistics to improve performance. Because these statistics are often computed on a relatively small evaluation corpus...
This paper proposes a novel application of a statistical language model to opinionated document retrieval targeting weblogs (blogs). In particular, we explore the use of the trigg...
One of the most important emerging developments for improving the user/computer interface has been the addition of multimedia facilities to high-performance workstations. Although...
The data volume of XML repositories and the response time of query processing have become critical issues for many applications, especially for those on the Web. An interesting alt...
The goal of this work is to add the capability to segment documents containing text, graphics, and pictures in the open source OCR engine OCRopus. To achieve this goal, OCRopus'...
Amy Winder, Tim L. Andersen, Elisa H. Barney Smith