Sciweavers

637 search results - page 15 / 128
» Training and documentation
ML
2000
ACM
Machine Learning
14 years 9 months ago
Text Classification from Labeled and Unlabeled Documents using EM
This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. ...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...
ICASSP
2011
IEEE
14 years 1 month ago
Using latent topic features to improve binary classification of spoken documents
In many topic identification applications, supervised training labels are indirectly related to the semantic content of the documents being classified. For example, many topical...
Jonathan Wintrode
IRAL
2000
ACM
15 years 2 months ago
Content-based language models for spoken document retrieval
Spoken document retrieval (SDR) has been extensively studied in recent years because of its potential use in navigating large multimedia collections in the near future. This paper...
Hsin-Min Wang, Berlin Chen
ICDAR
2007
IEEE
15 years 3 months ago
Content-level Annotation of Large Collection of Printed Document Images
A large annotated corpus is critical to the development of robust optical character recognizers (OCRs). However, creation of annotated corpora is a tedious task. It is laborious, ...
Anand Kumar 0002, C. V. Jawahar
JUCS
2008
14 years 9 months ago
Feature Selection for the Classification of Large Document Collections
Feature selection methods are often applied in the context of document classification. They are particularly important for processing large data sets that may contain millions of...
Janez Brank, Dunja Mladenic, Marko Grobelnik, Nata...