Sciweavers

637 search results - page 9 / 128
» Training and documentation
ICML
1998
IEEE
15 years 10 months ago
Employing EM and Pool-Based Active Learning for Text Classification
This paper shows how a text classifier's need for labeled training documents can be reduced by taking advantage of a large pool of unlabeled documents. We modify the Query-by...
Andrew McCallum, Kamal Nigam
ACL
2011
14 years 1 month ago
From Bilingual Dictionaries to Interlingual Document Representations
Mapping documents into an interlingual representation can help bridge the language barrier of a cross-lingual corpus. Previous approaches use aligned documents as training data to...
Jagadeesh Jagarlamudi, Hal Daumé III, Ragha...
ICDAR
2009
IEEE
14 years 7 months ago
Learning on the Fly: Font-Free Approaches to Difficult OCR Problems
Despite ubiquitous claims that optical character recognition (OCR) is a "solved problem," many categories of documents continue to break modern OCR software such as docu...
Andrew Kae, Erik G. Learned-Miller
MM
2006
ACM
15 years 3 months ago
Automatic document orientation detection and categorization through document vectorization
This paper presents an automatic orientation detection and categorization technique that is capable of detecting the orientation of multilingual documents with arbitrary skew and ...
Shijian Lu, Chew Lim Tan
COLING
2010
14 years 4 months ago
Enhancing Cross Document Coreference of Web Documents with Context Similarity and Very Large Scale Text Categorization
Cross Document Coreference (CDC) is the task of constructing the coreference chain for mentions of a person across a set of documents. This work offers a holistic view of using do...
Jian Huang 0002, Pucktada Treeratpituk, Sarah M. T...