Sciweavers

» Ontology-based Text Document Clustering
ICML
2003
IEEE
15 years 5 months ago
An Evaluation on Feature Selection for Text Clustering
Feature selection methods have been successfully applied to text categorization but seldom applied to text clustering due to the unavailability of class label information. In this...
Tao Liu, Shengping Liu, Zheng Chen, Wei-Ying Ma
SIGIR
2002
ACM
15 years 3 days ago
Unsupervised document classification using sequential information maximization
We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential ...
Noam Slonim, Nir Friedman, Naftali Tishby
SIGIR
2008
ACM
15 years 11 days ago
Enhancing text clustering by leveraging Wikipedia semantics
Most traditional text clustering methods are based on "bag of words" (BOW) representation based on frequency statistics in a set of documents. BOW, however, ignores the ...
Jian Hu, Lujun Fang, Yang Cao, Hua-Jun Zeng, Hua L...
SDM
2007
SIAM
15 years 1 months ago
Topic Models over Text Streams: A Study of Batch and Online Unsupervised Learning
Topic modeling techniques have widespread use in text data mining applications. Some applications use batch models, which perform clustering on the document collection in aggregat...
Arindam Banerjee, Sugato Basu
DEXAW
2008
IEEE
15 years 7 months ago
Text Extraction from the Web via Text-to-Tag Ratio
We describe a method to extract content text from diverse Web pages by using the HTML document’s Text-to-Tag Ratio rather than specific HTML cues that may not be constant acr...
Tim Weninger, William H. Hsu