Sciweavers

Search results for: An Empirical Comparison of Four Text Mining Methods
ICML
2003
IEEE
16 years 14 days ago
Text Bundling: Statistics Based Data-Reduction
As text corpora become larger, tradeoffs between speed and accuracy become critical: slow but accurate methods may not complete in a practical amount of time. In order to make the...
Lawrence Shih, Jason D. Rennie, Yu-Han Chang, Davi...
KDD
2009
ACM
16 years 6 days ago
Efficient methods for topic model inference on streaming document collections
Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of...
Limin Yao, David M. Mimno, Andrew McCallum
ICDM
2007
IEEE
15 years 6 months ago
Consensus Clusterings
In this paper we address the problem of combining multiple clusterings without access to the underlying features of the data. This process is known in the literature as clustering...
Nam Nguyen, Rich Caruana
ICDM
2003
IEEE
15 years 5 months ago
Dynamic Weighted Majority: A New Ensemble Method for Tracking Concept Drift
Algorithms for tracking concept drift are important for many applications. We present a general method based on the Weighted Majority algorithm for using any online learner for co...
Jeremy Z. Kolter, Marcus A. Maloof
DMIN
2007
15 years 1 months ago
Generative Oversampling for Mining Imbalanced Datasets
One way to handle data mining problems where class prior probabilities and/or misclassification costs between classes are highly unequal is to resample the data until a new, d...
Alexander Liu, Joydeep Ghosh, Cheryl Martin