Sciweavers

» Ontology-based Text Document Clustering
DOCENG
2010
ACM
15 years 1 month ago
Picture detection in document page images
We present a method for picture detection in document page images, which can come from scanned or camera images, or rendered from electronic file formats. Our method uses OCR to s...
Patrick Chiu, Francine Chen, Laurent Denoue
EMNLP
2010
14 years 10 months ago
Evaluating Models of Latent Document Semantics in the Presence of OCR Errors
Models of latent document semantics such as the mixture of multinomials model and Latent Dirichlet Allocation have received substantial attention for their ability to discover top...
Daniel David Walker, William B. Lund, Eric K. Ring...
ICDAR
2011
IEEE
14 years 1 day ago
A Chinese Character Localization Method Based on Intergrating Structure and CC-Clustering for Advertising Images
In this paper, a novel Chinese character localization method is proposed for texts in advertising images. To deal with the texts with gradient color, a color clustering method b...
Jie Liu, Shuwu Zhang, Heping Li, Wei Liang
ICPR
2008
IEEE
15 years 6 months ago
Stop word detection in compressed textual images: An experiment on indic script documents
Stop word detection is attempted in this work in the context of retrieval of document images in the compressed domain. Algorithms are presented to identify text lines and words an...
Utpal Garain, Amit Kumar Das
ICDM
2007
IEEE
15 years 6 months ago
Semi-supervised Clustering Using Bayesian Regularization
Text clustering is most commonly treated as a fully automated task without user supervision. However, we can improve clustering performance using supervision in the form of pairwi...
Zuobing Xu, Ram Akella, Mike Ching, Renjie Tang