We offer the first large-scale analysis of Web traffic based on network flow data. Using data collected on the Internet2 network, we constructed a weighted bipartite client-server ...
Mark Meiss, Filippo Menczer, Alessandro Vespignani
Accurately and efficiently estimating the number of distinct values for some attribute(s) or sets of attributes in a data set is of critical importance to many database operation...
The goal of this work is to study the feasibility of a Heterogeneous Data Classification and Search (HDCS) system and to provide a possible design for its implementation. In order t...
Dorin Carstoiu, Alexandra Cernian, Adriana Olteanu...
Multi-document summarization aims to create a compressed summary while retaining the main characteristics of the original set of documents. Many approaches use statistics and mach...
Dingding Wang, Tao Li, Shenghuo Zhu, Chris H. Q. D...
In this paper, we introduce a system named Argo which provides intelligent advertising made possible from users’ photo collections. Based on the intuition that user-generated ph...
Xin-Jing Wang, Mo Yu, Lei Zhang, Rui Cai, Wei-Ying...