Probabilistic latent semantic indexing (PLSI) represents documents of a collection as mixture proportions of latent topics, which are learned from the collection by an expectation...
Alexander Hinneburg, Hans-Henning Gabriel, Andr&eg...
In recent years, a major thread of research on kanonymity has focused on developing more flexible generalization schemes that produce higher-quality datasets. In this paper we in...
Gene Expression Programming (GEP) aims at discovering essential rules hidden in observed data and expressing them mathematically. GEP has been proved to be a powerful tool for cons...
Lei Duan, Changjie Tang, Tianqing Zhang, Dagang We...
We consider the problem of integrating a large number of interface schemas over the Deep Web, The scale of the problem and the diversity of the sources present serious challenges ...
Mostexisting decision tree systemsuse a greedyapproachto inducetrees -- locally optimalsplits are inducedat every node of the tree. Althoughthe greedy approachis suboptimal,it is ...