Sciweavers

2827 search results - page 211 / 566
» Marking Text Documents
ICDAR
2009
IEEE
15 years 3 months ago
Italic or Roman: Word Style Recognition without A Priori Knowledge for Old Printed Documents
This paper presents an Italic/Roman word type recognition system without a priori knowledge on the characters' font. This method aims at analyzing old documents in which char...
Loris Eynard, Hubert Emptoz
ACL
2012
13 years 8 months ago
Labeling Documents with Timestamps: Learning from their Time Expressions
Temporal reasoners for document understanding typically assume that a document’s creation date is known. Algorithms to ground relative time expressions and order events often re...
Nathanael Chambers
ICML
2006
IEEE
16 years 6 months ago
Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
Charles Elkan
WWW
2006
ACM
16 years 6 months ago
Visually guided bottom-up table detection and segmentation in web documents
In the AllRight project, we are developing an algorithm for unsupervised table detection and segmentation that uses the visual rendition of a Web page rather than the HTML code. O...
Bernhard Krüpl, Marcus Herzog
ICDAR
2009
IEEE
16 years 24 days ago
Generic Feature Selection and Document Processing
This paper presents a generic features selection method and its applications on some document analysis problems. The method is based on a genetic algorithm (GA), whose fitness funct...
Hassan Chouaib, Nicole Vincent, Florence Cloppet, ...