The min-sum k-clustering problem is to partition a metric space $(P, d)$ into $k$ clusters $C_1, \dots, C_k \subseteq P$ such that $\sum_{i=1}^{k} \sum_{p,q \in C_i} d(p, q)$ is minimized. We show the first effi...
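To make the objective concrete, here is a minimal sketch that evaluates the min-sum cost of a given partition. It assumes points in the Euclidean plane and that the inner sum ranges over ordered pairs, so each unordered pair is counted twice. The function name `min_sum_cost` and the sample points are illustrative and do not come from the paper. It only scores a partition and does not implement the paper's algorithm.

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def min_sum_cost(clusters, d=dist):
    """Sum d(p, q) over all ordered pairs (p, q) inside each cluster.

    Assumption: ordered pairs, so every unordered pair contributes twice
    and the pairs (p, p) contribute zero.
    """
    return sum(
        d(p, q)
        for cluster in clusters
        for p, q in product(cluster, repeat=2)
    )


# Hypothetical data: two tight groups that are far apart.
points_a = [(0.0, 0.0), (0.0, 1.0)]
points_b = [(5.0, 5.0), (5.0, 8.0)]

# Partition that keeps nearby points together.
good = min_sum_cost([points_a, points_b])

# Partition that mixes the two groups.
bad = min_sum_cost([[points_a[0], points_b[0]], [points_a[1], points_b[1]]])

print(good, bad)
```

The partition that keeps nearby points together has the lower cost, which is the quantity the problem asks to minimize over all partitions into k clusters.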
The multiple topics and varying lengths of web pages are two factors that significantly degrade the performance of web search. In this paper, we explore the use of page segmentati...
Complex graphs, in which multi-type nodes are linked to each other, frequently arise in many important applications, such as Web mining, information retrieval, bioinformatics, and...
Bo Long, Zhongfei (Mark) Zhang, Philip S. Yu, Tian...
Virtualization has been rapidly expanding its applications in numerous server and desktop environments to improve the utilization and manageability of physical systems. Such prolif...
Hadoop has become an attractive platform for large-scale data analytics. In this paper, we identify a major performance bottleneck of Hadoop: its lack of ability to colocate relat...