A large fraction of the useful web comprises of specification documents that largely consist of hattribute name, numeric valuei pairs embedded in text. Examples include product in...
We are working on a project aimed at building next generation analyst support tools that focus analysts’ attention on the most critical and novel information found within the da...
Challenging the implicit reliance on document collections, this paper discusses the pros and cons of using query logs rather than document collections, as self-contained sources o...
: Citations management is an important task in managing digital libraries. Citations provide valuable information e.g., used in evaluating an author's influences or scholarly ...
Muhammad Tanvir Afzal, Hermann A. Maurer, Wolf-Til...
This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...