Sciweavers

» Informative sampling for large unbalanced data sets
WSDM
2010
ACM
SBotMiner: Large Scale Search Bot Detection
In this paper, we study search bot traffic from search engine query logs at a large scale. Although bots that generate search traffic aggressively can be easily detected, a large ...
Fang Yu, Yinglian Xie, Qifa Ke
ER
2007
Springer
Capturing Users' Everyday, Implicit Information Integration Decisions
Integration of large databases by expert teams is only a small part of the data integration activities that take place. Users without data integration expertise very often gather,...
David W. Archer, Lois M. L. Delcambre
WWW
2009
ACM
Matchbox: large scale online Bayesian recommendations
We present a probabilistic model for generating personalised recommendations of items to users of a web service. The Matchbox system makes use of content information in the form o...
David H. Stern, Ralf Herbrich, Thore Graepel
KDD
2007
ACM
Modeling relationships at multiple scales to improve accuracy of large recommender systems
The collaborative filtering approach to recommender systems predicts user preferences for products or services by learning past user-item relationships. In this work, we propose no...
Robert M. Bell, Yehuda Koren, Chris Volinsky
BMCBI
2008
Unsupervised reduction of random noise in complex data by a row-specific, sorted principal component-guided method
Background: Large biological data sets, such as expression profiles, benefit from reduction of random noise. Principal component (PC) analysis has been used for this purpose, but ...
Joseph W. Foley, Fumiaki Katagiri