Sciweavers

1085 search results - page 115 / 217
» Active Mining in a Distributed Setting
SDM
2004
SIAM
15 years 1 month ago
Subspace Clustering of High Dimensional Data
Clustering suffers from the curse of dimensionality, and similarity functions that use all input features with equal relevance may not be effective. We introduce an algorithm that...
Carlotta Domeniconi, Dimitris Papadopoulos, Dimitr...
CIKM
2010
Springer
14 years 11 months ago
Understanding retweeting behaviors in social networks
Retweeting is an important action (behavior) on Twitter, indicating the behavior that users re-post microblogs of their friends. While much work has been conducted for mining text...
Zi Yang, Jingyi Guo, Keke Cai, Jie Tang, Juanzi Li...
GROUP
2005
ACM
15 years 6 months ago
Seeking the source: software source code as a social and technical artifact
In distributed software development, two sorts of dependencies can arise. The structure of the software system itself can create dependencies between software elements, while the ...
Cleidson R. B. de Souza, Jon Froehlich, Paul Douri...
AUSDM
2006
Springer
15 years 4 months ago
Accuracy Estimation With Clustered Dataset
If the dataset available to machine learning results from cluster sampling (e.g. patients from a sample of hospital wards), the usual cross-validation error rate estimate can lead...
Ricco Rakotomalala, Jean-Hugues Chauchat, Franç...
ICDM
2008
IEEE
15 years 7 months ago
Multiple-Instance Regression with Structured Data
We present a multiple-instance regression algorithm that models internal bag structure to identify the items most relevant to the bag labels. Multiple-instance regression (MIR) op...
Kiri L. Wagstaff, Terran Lane, Alex Roper