We present a novel approach for classifying documents that combines different pieces of evidence (e.g., textual features of documents, links, and citations) transparently, through...
Adriano Veloso, Wagner Meira Jr., Marco Cristo, Ma...
Developing effective content recognition methods for diverse imagery continues to challenge computer vision researchers. We present a new approach for document image content catego...
Guangyu Zhu, Xiaodong Yu, Yi Li, David S. Doermann
Recently, spammers have proliferated "image spam", emails which contain the text of the spam message in a human readable image instead of the message body, making detect...
Representing documents by vectors that are independent of language enhances machine translation and multilingual text categorization. We use discriminative training to create a pr...
The huge volumes of biomedical texts available online drives the increasing need for automated techniques to analyze and extract knowledge from these repositories of information. ...