The Perseus Project at Tufts University produces tools to enhance the study of humanities texts. Perseus’ new named-entity browser lets users browse an index of references to pe...
We focus on the task of target detection in automatic link generation with Wikipedia, i.e., given an N-gram in a snippet of text, find the relevant Wikipedia concepts that explai...
We address the task of separating personal from non-personal blogs, and report on a set of baseline experiments where we compare the performance on a small set of features across ...
The web crawler space is often divided into two general areas: full-web crawling and focused crawling. We present netSifter, a crawler system that integrates features from thes...
Peer-to-peer Data Networks (PDNs) are large-scale, self-organizing, distributed query processing systems. Familiar examples of PDNs are peer-to-peer file-sharing networks, which su...