Text categorization and retrieval tasks are often based on a good representation of textual data. Departing from the classical vector space model, several probabilistic models have...
We describe an approach to object and scene retrieval which searches for and localizes all the occurrences of a user outlined object in a video. The object is represented by a set...
ABSTRACT: OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...
Often scientists seek to search for articles on the Web related to a particular chemical. When a scientist searches for a chemical formula using a search engine today, she gets ar...
Bingjun Sun, Qingzhao Tan, Prasenjit Mitra, C. Lee...
Representing documents by vectors that are independent of language enhances machine translation and multilingual text categorization. We use discriminative training to create a pr...