Search Sciweavers | Sciweavers

ECIR
2008
Springer

Information Technology

Semi-supervised Document Classification with a Mislabeling Error Model

15 years 5 months ago

Abstract. This paper investigates a new extension of the Probabilistic Latent Semantic Analysis (PLSA) model [6] for text classification where the training set is partially labeled...

Anastasia Krithara, Massih-Reza Amini, Jean-Michel...

NIPS
2000

Information Technology

The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity

15 years 5 months ago

We describe a joint probabilistic model for modeling the contents and inter-connectivity of document collections such as sets of web pages or research paper archives. The model is...

David A. Cohn, Thomas Hofmann

RIAO
2004

Information Technology

Multilingual document clusters discovery

15 years 5 months ago

The Cross-Language Information Retrieval community has produced search engines for multilingual corpora, as well as multilingual text categorization systems. In this paper, we focus on th...

Benoît Mathieu, Romaric Besançon, Chr...

COLING
2002

Computational Linguistics

Effective Structural Inference for Large XML Documents

15 years 4 months ago

This paper investigates methods to automatically infer structural information from large XML documents. Using XML as a reference format, we approach the schema generation problem ...

Jason Sankey, Raymond K. Wong

ICIP
2001
IEEE

Image Processing

Restoration of images scanned from thick bound documents

16 years 5 months ago

Perspective distortion always occurs while scanning thick, bound documents. This distortion mainly causes two sources of degradation for the scanned grayscale image: i) shade alo...

Zheng Zhang 0003, Chew Lim Tan
