Lixto is a system and method for the visual and interactive generation of wrappers for Web pages under the supervision of a human developer, for automatically extracting informatio...
There have been many attempts to study the content of the web, either through human or automatic agents. Five different previously used web survey methodologies are described and ...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
Web sites, Web pages and the data on pages are available only for specific periods of time and are deleted afterwards from a client’s point of view. An important task in order t...
This paper presents a method for generating indexable and browsable keyword metadata from ASR transcripts by leveraging the Web. Search engine queries are built from an ASR transc...
Kishan Thambiratnam, Gang Li, Sha Meng, Frank Seid...