Microformats and semantic XHTML add semantics to web pages while taking advantage of the existing (X)HTML infrastructure. This approach enables new applications that can be deploy...
This paper pursues the recently emerging paradigm of searching for entities that are embedded in Web pages. We utilize informationextraction techniques to identify entity candidat...
Julia Stoyanovich, Srikanta J. Bedathur, Klaus Ber...
We describe ongoing research on segmenting and labeling HTML medical journal articles. In contrast to existing approaches in which HTML tags usually serve as strong indicators, we...
This paper reports some experiments in using SVG (Scalable Vector Graphics), rather than the browser default of (X)HTML/CSS, as a potential Web-based rendering technology, in an a...
In this paper, we identify and analyze structural properties which reflect the functionality of a Web site. These structural properties consider the size, the organization, the co...