Top-k similarity joins have been extensively studied and used
in a wide spectrum of applications such as information retrieval, decision
making, spatial data analysis and dat...
It is well known that Web-page classification can be enhanced by using hyperlinks that provide linkages between Web pages. However, in the Web space, hyperlinks are usually sparse...
Recently, many data types arising from data mining and Web search applications can be modeled as bipartite graphs. Examples include queries and URLs in query logs, and authors and ...
Sampling is a popular method of data collection when it is impossible or too costly to reach the entire population. For example, television show ratings in the United States are g...
Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the databas...