The techniques of information retrieval and information extraction are complementary, but to date there has been little concrete work aimed at integrating the two. We describe how...
XML is rapidly emerging as the new standard for data representation and exchange on the Web. An XML document can be accompanied by a Document Type Descriptor (DTD) which plays the...
Minos N. Garofalakis, Aristides Gionis, Rajeev Ras...
This paper presents PDF-TREX, an heuristic approach for table recognition and extraction from PDF documents. The heuristics starts from an initial set of basic content elements an...
Data retrieved from body sensors such as ECG machines and new-generation multi-sensor systems such as respiratory monitors are varied and abundant. Managing and integrating this s...
Alfonso F. Cardenas, Raymond K. Pon, Robert B. Cam...
Information Extraction (IE) technology is facing new challenges of dealing with large-scale heterogeneous data sources from different documents, languages and modalities. Informat...