Sciweavers

3716 search results - page 702 / 744
» On the monotonization of the training set
KDD
2009
ACM
156 views, Data Mining
16 years 1 month ago
Effective multi-label active learning for text classification
Labeling text data is quite time-consuming but essential for automatic text classification. Especially, manually creating multiple labels for each document may become impractical ...
Bishan Yang, Jian-Tao Sun, Tengjiao Wang, Zheng Ch...
KDD
2009
ACM
191 views, Data Mining
16 years 1 month ago
Efficient methods for topic model inference on streaming document collections
Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of...
Limin Yao, David M. Mimno, Andrew McCallum
KDD
2009
ACM
232 views, Data Mining
16 years 1 month ago
Classification of software behaviors for failure detection: a discriminative pattern mining approach
Software is a ubiquitous component of our daily life. We often depend on the correct working of software systems. Due to the difficulty and complexity of software systems, bugs an...
David Lo, Hong Cheng, Jiawei Han, Siau-Cheng Khoo,...
KDD
2008
ACM
176 views, Data Mining
16 years 1 month ago
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Peter Christen
KDD
2008
ACM
259 views, Data Mining
16 years 1 month ago
Using ghost edges for classification in sparsely labeled networks
We address the problem of classification in partially labeled networks (a.k.a. within-network classification) where observed class labels are sparse. Techniques for statistical re...
Brian Gallagher, Hanghang Tong, Tina Eliassi-Rad, ...