Sciweavers

4085 search results - page 433 / 817
» Benchmarking Data Mining Algorithms
ICDM
2007
IEEE
184 views, Data Mining
15 years 11 months ago
Bayesian Folding-In with Dirichlet Kernels for PLSI
Probabilistic latent semantic indexing (PLSI) represents documents of a collection as mixture proportions of latent topics, which are learned from the collection by an expectation...
Alexander Hinneburg, Hans-Henning Gabriel, Andr...
ICDM
2006
IEEE
127 views, Data Mining
15 years 11 months ago
Optimal k-Anonymity with Flexible Generalization Schemes through Bottom-up Searching
In recent years, a major thread of research on k-anonymity has focused on developing more flexible generalization schemes that produce higher-quality datasets. In this paper we in...
Tiancheng Li, Ninghui Li
ADMA
2006
Springer
131 views, Data Mining
15 years 11 months ago
Distance Guided Classification with Gene Expression Programming
Gene Expression Programming (GEP) aims at discovering essential rules hidden in observed data and expressing them mathematically. GEP has been proved to be a powerful tool for cons...
Lei Duan, Changjie Tang, Tianqing Zhang, Dagang We...
ICDM
2005
IEEE
146 views, Data Mining
15 years 10 months ago
Merging Interface Schemas on the Deep Web via Clustering Aggregation
We consider the problem of integrating a large number of interface schemas over the Deep Web. The scale of the problem and the diversity of the sources present serious challenges ...
Wensheng Wu, AnHai Doan, Clement T. Yu
KDD
1995
ACM
140 views, Data Mining
15 years 8 months ago
Decision Tree Induction: How Effective is the Greedy Heuristic?
Most existing decision tree systems use a greedy approach to induce trees -- locally optimal splits are induced at every node of the tree. Although the greedy approach is suboptimal, it is ...
Sreerama K. Murthy, Steven Salzberg