What makes template content on the Web so special that we need to remove it? In this paper I present a large-scale aggregate analysis of textual Web content, corroborating statist...
We present the results of a community detection analysis of the Wikipedia graph. Distinct communities in Wikipedia contain semantically closely related articles. The central topic...
We present a new bicriteria approximation algorithm for the degree-bounded minimum-cost spanning tree problem: Given an undirected graph with nonnegative edge weights and degree b...
In this study we propose sketching algorithms for computing similarities between hierarchical data. Specifically, we look at data objects that are represented using leaf-labeled t...
We describe a method for finding ungapped conserved words in rRNA sequences that is effective, utilizes evolutionary information and does not depend on multiple sequence alignment...
Liaofu Luo, Li-Ching Hsieh, Fengmin Ji, Mengwen Ji...