Practical clustering algorithms require multiple data scans to achieve convergence. For large databases, these scans become prohibitively expensive. We present a scalable clusteri...
The development of technologies to address machine translation and distillation of multilingual broadcast data depends heavily on the collection of large volumes of material from ...
In many application domains, data is collected and referenced by its geo-spatial location. Spatial data mining, or the discovery of interesting patterns in such databases, is an i...
Real-world social networks are often hierarchical, reflecting the fact that some communities are composed of a few smaller sub-communities. This paper describes a hierarchical B...
Haizheng Zhang, Wei Li, Xuerui Wang, C. Lee Giles,...
The clinical and basic science research domains present exciting and difficult data integration issues. Solving these problems is crucial as current research efforts in the field ...