Sciweavers

385 search results - page 36 / 77
» Improving data mining utility with projective sampling
NIPS
2007
14 years 11 months ago
Mining Internet-Scale Software Repositories
Large repositories of source code create new challenges and opportunities for statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated c...
Erik Linstead, Paul Rigor, Sushil Krishna Bajracha...
ICDE
2009
IEEE
14 years 7 months ago
Efficient Mining of Closed Repetitive Gapped Subsequences from a Sequence Database
There is a huge wealth of sequence data available, for example, customer purchase histories, program execution traces, DNA, and protein sequences. Analyzing this wealth of data to ...
Bolin Ding, David Lo, Jiawei Han, Siau-Cheng Khoo
PAKDD
2009
ACM
15 years 4 months ago
Pairwise Constrained Clustering for Sparse and High Dimensional Feature Spaces
Clustering high dimensional data with sparse features is challenging because pairwise distances between data items are not informative in high dimensional space. To addre...
Su Yan, Hai Wang, Dongwon Lee, C. Lee Giles
ICDM
2010
IEEE
14 years 7 months ago
Modeling Experts and Novices in Citizen Science Data for Species Distribution Modeling
Citizen scientists, who are volunteers from the community that participate as field assistants in scientific studies [3], enable research to be performed at much larger spatial and...
Jun Yu, Weng-Keen Wong, Rebecca A. Hutchinson
KDD
2002
ACM
15 years 10 months ago
From Data To Insight: The Community Of Multimedia Agents
Multimedia Data Mining requires the ability to automatically analyze and understand the content. The Community of Multimedia Agents project (COMMA) is devoted to creating an open ...
Gang Wei, Valery A. Petrushin, Anatole Gershman