Sciweavers

CORR
2010
Springer
13 years 2 months ago
How I won the "Chess Ratings - Elo vs the Rest of the World" Competition
This article discusses in detail the rating system that won the Kaggle competition "Chess Ratings: Elo vs the rest of the world". The competition provided a historical d...
Yannis Sismanis
ICCV
2009
IEEE
13 years 2 months ago
Incremental Multiple Kernel Learning for object recognition
A good training dataset, representative of the test images expected in a given application, is critical for ensuring good performance of a visual categorization system. Obtaining ...
Aniruddha Kembhavi, Behjat Siddiquie, Roland Miezi...
MASS
2010
13 years 2 months ago
Spatial extension of the Reality Mining Dataset
Data captured from a live cellular network with the real users during their common daily routine help to understand how the users move within the network. Unlike the simulations wi...
Michal Ficek, Lukas Kencl
ICML
2010
IEEE
13 years 3 months ago
Mining Clustering Dimensions
Many real-world datasets can be clustered along multiple dimensions. For example, text documents can be clustered not only by topic, but also by the author's gender or sentim...
Sajib Dasgupta, Vincent Ng
JASIS
2010
13 years 3 months ago
Understanding latent semantic indexing: A topological structure analysis using Q-analysis
The method of latent semantic indexing (LSI) is well known for tackling the synonymy and polysemy problems in information retrieval. However, its performance can be ve...
Dandan Li, Chung-Ping Kwong
SIGKDD
2000
13 years 4 months ago
Scalability for Clustering Algorithms Revisited
This paper presents a simple new algorithm that performs k-means clustering in one scan of a dataset, while using a buffer for points from the dataset of fixed size. Experiments show...
Fredrik Farnstrom, James Lewis, Charles Elkan
IJHPCA
2007
13 years 5 months ago
Scaling Properties of Common Statistical Operators for Gridded Datasets
An accurate cost-model that accounts for dataset size and structure can help optimize geoscience data analysis. We develop and apply a computational model to estimate data analysi...
Charles S. Zender, Harry Mangalam
AAI
2008
13 years 5 months ago
Adaptive Machine Learning in Delayed Feedback Domains by Selective Relearning
We present a novel hybrid technique for improving the predictive performance of an online Machine Learning system: Combining advantages from both memory based and concept based pr...
Marcus-Christopher Ludl, Achim Lewandowski, Georg ...
COCOA
2008
Springer
13 years 6 months ago
Fixed-Parameter Tractability of Anonymizing Data by Suppressing Entries
A popular model for protecting privacy when person-specific data is released is k-anonymity. A dataset is k-anonymous if each record is identical to at least (k - 1) other records ...
Rhonda Chaytor, Patricia A. Evans, Todd Wareham
ADC
2006
Springer
13 years 8 months ago
A reconstruction-based algorithm for classification rules hiding
Data sharing between two organizations is common in many application areas, e.g. business planning or marketing. Useful global patterns can be discovered from the integrated dataset...
Juggapong Natwichai, Xue Li, Maria E. Orlowska