We consider the problem of segmenting a webpage into visually and semantically cohesive pieces. Our approach is based on formulating an appropriate optimization problem on weighte...
Given a huge online social network, how do we retrieve information from it through crawling? Even better, how do we improve the crawling performance by using parallel crawlers tha...
Duen Horng Chau, Shashank Pandit, Samuel Wang, Chr...
In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
This paper is to investigate the group behavior patterns of search activities based on Web search history data, i.e., clickthrough data, to boost search performance. We propose a ...
This paper describes the eBag infrastructure, which is a generic infrastructure inspired from work with school children who could bene t from a electronic schoolbag for collaborat...
Christina Brodersen, Bent Guldbjerg Christensen, K...