Sciweavers

» Document Analysis
ECIR
2008
Springer
15 years 5 months ago
Semi-supervised Document Classification with a Mislabeling Error Model
This paper investigates a new extension of the Probabilistic Latent Semantic Analysis (PLSA) model [6] for text classification where the training set is partially labeled...
Anastasia Krithara, Massih-Reza Amini, Jean-Michel...
NIPS
2000
15 years 5 months ago
The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity
We describe a joint probabilistic model for modeling the contents and inter-connectivity of document collections such as sets of web pages or research paper archives. The model is...
David A. Cohn, Thomas Hofmann
RIAO
2004
15 years 5 months ago
Multilingual document clusters discovery
Cross Language Information Retrieval community has brought up search engines over multilingual corpora, and multilingual text categorization systems. In this paper, we focus on th...
Benoît Mathieu, Romaric Besançon, Chr...
COLING
2002
15 years 4 months ago
Effective Structural Inference for Large XML Documents
This paper investigates methods to automatically infer structural information from large XML documents. Using XML as a reference format, we approach the schema generation problem ...
Jason Sankey, Raymond K. Wong
ICIP
2001
IEEE
16 years 5 months ago
Restoration of images scanned from thick bound documents
Perspective distortion always occurs while scanning thick, bound documents. This distortion mainly causes two sources of degradation for the scanned grayscale image: i) shade alo...
Zheng Zhang 0003, Chew Lim Tan