In this paper, we present a novel method for classifying Web sites. The method exploits both the structure and the content of Web sites to discern their functionality....
As part of a broader effort to acquire large repositories of facts from unstructured text on the Web, a seed-based framework for textual information extraction allows for weakly sup...
Classification of documents by genre is typically done using either linguistic analysis or term-frequency-based techniques. The former provides better classification accuracy than...
We describe a framework for automatically selecting a summary set of photographs from a large collection of geo-referenced photos. The summary algorithm is based on spatial patter...
Alexander Jaffe, Mor Naaman, Tamir Tassa, Marc Dav...
This paper presents a method for finding a specification page on the web for a given object (e.g., "Titanic") and its class label (e.g., "film"). A specificati...