Sciweavers

» Mining Relevant Text from Unlabelled Documents
ITCC
2005
IEEE
15 years 3 months ago
Elimination of Redundant Information for Web Data Mining
These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditi...
Shakirah Mohd Taib, Soon-ja Yeom, Byeong Ho Kang
ICDM
2003
IEEE
15 years 3 months ago
A Dynamic Adaptive Self-Organising Hybrid Model for Text Clustering
Clustering by document concepts is a powerful way of retrieving information from a large number of documents. This task in general does not make any assumption on the data distrib...
Chihli Hung, Stefan Wermter
ICDM
2002
IEEE
15 years 2 months ago
Iterative Clustering of High Dimensional Text Data Augmented by Local Search
The k-means algorithm with cosine similarity, also known as the spherical k-means algorithm, is a popular method for clustering document collections. However, spherical k-means ca...
Inderjit S. Dhillon, Yuqiang Guan, J. Kogan
SDM
2008
SIAM
14 years 11 months ago
Semantic Smoothing for Bayesian Text Classification with Small Training Data
Bayesian text classifiers face a common issue referred to as the data sparsity problem, especially when the size of training data is very small. The frequently used Laplacian...
Xiaohua Zhou, Xiaodan Zhang, Xiaohua Hu
RIAO
2000
14 years 11 months ago
Discovering and Comparing Topic Hierarchies
Hierarchies have been used for organization, summarization, and access to information, yet a lingering issue is how best to construct them. In this paper, our goal is to automatic...
Dawn Lawrie, W. Bruce Croft