Sciweavers

572 search results - page 91 / 115
» Estimating the Accuracy of Learned Concepts
ICML
2009
IEEE
15 years 10 months ago
Robust bounds for classification via selective sampling
We introduce a new algorithm for binary classification in the selective sampling protocol. Our algorithm uses Regularized Least Squares (RLS) as base classifier, and for this reas...
Nicolò Cesa-Bianchi, Claudio Gentile, Franc...
WWW
2008
ACM
15 years 10 months ago
Mining the search trails of surfing crowds: identifying relevant websites from user activity
The paper proposes identifying relevant information sources from the history of combined searching and browsing behavior of many Web users. While it has been previously shown that...
Mikhail Bilenko, Ryen W. White
KDD
2008
ACM
132 views, Data Mining
15 years 10 months ago
Partitioned logistic regression for spam filtering
Naive Bayes and logistic regression perform well in different regimes. While the former is a very simple generative model which is efficient to train and performs well empirically...
Ming-wei Chang, Wen-tau Yih, Christopher Meek
KDD
2003
ACM
214 views, Data Mining
15 years 10 months ago
Adaptive duplicate detection using learnable string similarity measures
The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied...
Mikhail Bilenko, Raymond J. Mooney
DATE
2009
IEEE
105 views, Hardware
15 years 4 months ago
Enrichment of limited training sets in machine-learning-based analog/RF test
This paper discusses the generation of information-rich, arbitrarily large synthetic data sets which can be used to (a) efficiently learn tests that correlate a set of ...
Haralampos-G. D. Stratigopoulos, Salvador Mir, Yio...