Search Sciweavers | Sciweavers

367 search results - page 31 / 74

» Indexing Text Documents Based on Topic Identification

ICDE
2007
IEEE

211views Database» more ICDE 2007»

Document Representation and Dimension Reduction for Text Clustering

15 years 10 months ago

Download torch.cs.dal.ca

Increasingly large text datasets and the high dimensionality associated with natural language create a great challenge in text mining. In this research, a systematic study is cond...

M. Mahdi Shafiei, Singer Wang, Roger Zhang, Evange...

CIKM
2005
Springer

160views Information Technology» more CIKM 2005»

Fast on-line index construction by geometric partitioning

15 years 5 months ago

Download goanna.cs.rmit.edu.au

Inverted index structures are the mainstay of modern text retrieval systems. They can be constructed quickly using off-line merge-based methods, and provide efficient support for ...

Nicholas Lester, Alistair Moffat, Justin Zobel

SIGIR
1999
ACM

153views Information Technology» more SIGIR 1999»

Probabilistic Latent Semantic Indexing

15 years 8 months ago

Download www.cs.brown.edu

Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fit...

Thomas Hofmann

DAS
2006
Springer

128views Document Analysis» more DAS 2006»

Writer Identification for Smart Meeting Room Systems

15 years 7 months ago

Download www.dfki.uni-kl.de

Abstract. In this paper we present a text independent on-line writer identification system based on Gaussian Mixture Models (GMMs). This system has been developed in the context of...

Marcus Liwicki, Andreas Schlapbach, Horst Bunke, S...

ICDAR
1997
IEEE

143views Document Analysis» more ICDAR 1997»

Representing OCRed documents in HTML

15 years 8 months ago

Download www.cedar.buffalo.edu

ABSTRACT: OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...

Tao Hong, Sargur N. Srihari
