Sciweavers

2827 search results - page 191 / 566
» Marking Text Documents
WWW
2008
ACM
16 years 5 months ago
Reasoning about similarity queries in text retrieval tasks
In many text retrieval tasks, it is highly desirable to obtain a "similarity profile" of the document collection for a given query. We propose sampling-based techniques ...
Xiaohui Yu, Yang Liu
ICML
2005
IEEE
16 years 5 months ago
A model for handling approximate, noisy or incomplete labeling in text classification
We introduce a Bayesian model, BayesANIL, that is capable of estimating uncertainties associated with the labeling process. Given a labeled or partially labeled training corpus of...
Ganesh Ramakrishnan, Krishna Prasad Chitrapura, Ra...
ICDM
2003
IEEE
15 years 10 months ago
A Dynamic Adaptive Self-Organising Hybrid Model for Text Clustering
Clustering by document concepts is a powerful way of retrieving information from a large number of documents. This task in general does not make any assumption on the data distrib...
Chihli Hung, Stefan Wermter
ICDM
2002
IEEE
15 years 10 months ago
Iterative Clustering of High Dimensional Text Data Augmented by Local Search
The k-means algorithm with cosine similarity, also known as the spherical k-means algorithm, is a popular method for clustering document collections. However, spherical k-means ca...
Inderjit S. Dhillon, Yuqiang Guan, J. Kogan
DL
1999
Springer
15 years 9 months ago
Quality of OCR for Degraded Text Images
Commercial OCR packages work best with high-quality scanned images. They often produce poor results when the image is degraded, either because the original itself was poor quality,...
Roger T. Hartley, Kathleen Crumpton