Sciweavers

1950 search results - page 1 / 390
» Informative sampling for large unbalanced data sets
GECCO
2008
Springer
Informative sampling for large unbalanced data sets
Selective sampling is a form of active learning which can reduce the cost of training by only drawing informative data points into the training set. This selected training set is ...
Zhenyu Lu, Anand I. Rughani, Bruce I. Tranmer, Jos...
DMIN
2007
Cost-Sensitive Learning vs. Sampling: Which is Best for Handling Unbalanced Classes with Unequal Error Costs?
The classifier built from a data set with a highly skewed class distribution generally predicts the more frequently occurring classes much more often than the infrequently occurr...
Gary M. Weiss, Kate McCarthy, Bibi Zabar
BMCBI
2006
The impact of sample imbalance on identifying differentially expressed genes
Background: Recently several statistical methods have been proposed to identify genes with differential expression between two conditions. However, very few studies consider the p...
Kun Yang, Jianzhong Li, Hong Gao
KDD
2002
ACM
Learning to match and cluster large high-dimensional data sets for data integration
Part of the process of data integration is determining which sets of identifiers refer to the same real-world entities. In integrating databases found on the Web or obtained by us...
William W. Cohen, Jacob Richman
CORR
2010
Springer
Rules of Thumb for Information Acquisition from Large and Redundant Data
We develop an abstract model of information acquisition from redundant data. We assume a random sampling process from data which contain information with bias and are interested in...
Wolfgang Gatterbauer