XML is rapidly emerging as the new standard for data representation and exchange on the Web. An XML document can be accompanied by a Document Type Descriptor (DTD) which plays the...
Minos N. Garofalakis, Aristides Gionis, Rajeev Ras...
The problem of Writer Verification is to decide whether two handwritten documents were written by the same person. Providing a strength of evidence for any such ...
Harish Srinivasan, S. Kabra, Chen Huang, Sargur N....
Generative models such as statistical language modeling have been widely studied in the task of expert search to model the relationship between experts and their expertise indicat...
This paper presents PDF-TREX, a heuristic approach for table recognition and extraction from PDF documents. The heuristic starts from an initial set of basic content elements an...
Information extraction (IE) aims at extracting specific information from a collection of documents. A lot of previous work on IE from semi-structured documents (in XML or HTML) us...
Raymond Kosala, Maurice Bruynooghe, Jan Van den Bu...