Sciweavers

» Iterated Document Content Classification
SIGMOD
2004
ACM
16 years 3 months ago
When one Sample is not Enough: Improving Text Database Selection Using Shrinkage
Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the databas...
Panagiotis G. Ipeirotis, Luis Gravano
ICPR
2008
IEEE
16 years 4 months ago
Generic scale-space process for handwriting documents analysis
This paper presents a generic architecture for handwriting documents analysis. It covers all analysis steps from the content description of the document (layout analysis, handwrit...
Guillaume Joutel, Hubert Emptoz, Véronique ...
ICADL
2007
Springer
15 years 9 months ago
On Building a Full-Text Digital Library of Historical Documents
The National Taiwan University Library has built a digital library of historical documents about Taiwan. The content is unique in that it covers about 80% of all primary Chinese hi...
Szu-Pei Chen, Jieh Hsiang, Hsieh-Chang Tu, Micha W...
JIIS
2002
15 years 2 months ago
Hidden Markov Models for Text Categorization in Multi-Page Documents
In the traditional setting, text categorization is formulated as a concept learning problem where each instance is a single isolated document. However, this perspective is not appr...
Paolo Frasconi, Giovanni Soda, Alessandro Vullo
ECIR
2006
Springer
15 years 4 months ago
Automatic Document Organization in a P2P Environment
This paper describes an efficient method to construct reliable machine learning applications in peer-to-peer (P2P) networks by building ensemble-based meta methods. We co...
Stefan Siersdorfer, Sergej Sizov