Sciweavers

ECIR
2007
Springer
14 years 11 months ago
Entropy-Based Authorship Search in Large Document Collections
The purpose of authorship search is to identify documents written by a particular author or in a particular style in large document collections. Standard search engines match docum...
Ying Zhao, Justin Zobel
ICML
2006
IEEE
15 years 10 months ago
Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
Charles Elkan
ICDAR
2009
IEEE
15 years 4 months ago
Spatial and Spectral Based Segmentation of Text in Multispectral Images of Ancient Documents
In this paper we propose a character segmentation method for multispectral images of ancient documents. Due to the low quality of the images the main idea of this study is to comb...
Martin Lettner, Robert Sablatnig
ICDAR
2009
IEEE
15 years 4 months ago
Finding Images and Line-Drawings in Document-Scanning Systems
The system presented in this paper finds images and line-drawings in scanned pages; it is a crucial processing step in the creation of a large-scale system to detect and index ima...
Shumeet Baluja, Michele Covell
DOCENG
2004
ACM
15 years 3 months ago
Supervised learning for the legacy document conversion
We consider the problem of document conversion from the rendering-oriented HTML markup into a semantic-oriented XML annotation defined by user-specific DTDs or XML Schema descrip...
Boris Chidlovskii, Jérôme Fuselier