In this paper we discuss the possible application of new concepts in web content extraction: utility assessment, utility annealing, and dynamic aggregated document generation. Aft...
A large number of web sites publish pages containing structured information about recognizable concepts, but these data are only partially used by current applications. Although s...
Paolo Papotti, Valter Crescenzi, Paolo Merialdo, M...
This paper presents a novel information system integrating advanced information extraction technology and automatic hyper-linking. Extracted entities are mapped into a domain onto...
Stephan Busemann, Witold Drozdzynski, Hans-Ulrich ...
Even in a massive corpus such as the Web, a substantial fraction of extractions appear infrequently. This paper shows how to assess the correctness of sparse extractions by utiliz...
During the past decade, there have been significant advances in the field of Natural Language Processing (NLP) and, in particular, Information Extraction (IE) [2], which have fueled...
Kiyoshi Sudo, Amit Bagga, Lawrence O'Gorman, Jon L...