ICDCS
2002
IEEE

A Fully Distributed Framework for Cost-Sensitive Data Mining

Data mining systems aim to discover patterns and extract useful information from facts recorded in databases. A widely adopted approach is to apply machine learning algorithms to compute descriptive models or classifiers from the available data. Two of the main challenges in this area are that i) databases are large and possibly physically distributed, and ii) data are cost-sensitive: examples in the databases usually carry different prices or benefits (such as a charity donation amount), so an effective model must be more accurate on examples with higher benefits. Here, we explore the development of techniques that address both issues to scale up cost-sensitive data mining. One naive approach to distributed data mining is a centralized system that ships all available data from the different sites to a single site to learn a global model. Besides its obvious communication overhead, this approach is ineffective due to many practical concerns. The second approach is a part...
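
To make the cost-sensitive point in the abstract concrete, here is a minimal Python sketch of a benefit-aware decision rule. It is an illustration only, not the framework described in the paper: the function names, the toy data, and the fixed per-action cost are all assumptions made for this example. It compares acting whenever the expected benefit exceeds the cost against a plain accuracy-style rule that acts whenever the estimated probability is at least 0.5.

```python
def expected_benefit_decision(p_positive, benefit, action_cost):
    """Act only when the expected benefit exceeds the cost of acting."""
    return p_positive * benefit > action_cost


def total_net_benefit(decisions, outcomes, benefits, action_cost):
    """Sum realized benefit minus cost over every example we acted on."""
    total = 0.0
    for act, positive, benefit in zip(decisions, outcomes, benefits):
        if act:
            total += (benefit if positive else 0.0) - action_cost
    return total


# Hypothetical toy data: (estimated probability, benefit if positive, true outcome).
examples = [
    (0.9, 1.0, True),    # likely positive, but the benefit is small
    (0.1, 50.0, True),   # unlikely positive, but the benefit is large
    (0.5, 1.0, False),
]
ACTION_COST = 2.0  # made-up cost of acting on one example

probs = [p for p, _, _ in examples]
benefits = [b for _, b, _ in examples]
outcomes = [y for _, _, y in examples]

# Cost-sensitive rule: weigh probability by benefit.
cost_sensitive = [expected_benefit_decision(p, b, ACTION_COST)
                  for p, b in zip(probs, benefits)]

# Accuracy-style rule: ignore benefits and threshold the probability.
accuracy_style = [p >= 0.5 for p in probs]

net_cost_sensitive = total_net_benefit(cost_sensitive, outcomes, benefits, ACTION_COST)
net_accuracy_style = total_net_benefit(accuracy_style, outcomes, benefits, ACTION_COST)
```

On this toy data the benefit-aware rule acts only on the second example, while the probability threshold acts on the first and third. This is the sense in which a model may need to be more accurate on high-benefit examples than on the rest.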
Added 14 Jul 2010
Updated 14 Jul 2010
Type Conference
Year 2002
Where ICDCS
Authors Wei Fan, Haixun Wang, Philip S. Yu, Salvatore J. Stolfo