Sciweavers

72 search results - page 2 / 15
» A hierarchical representation of form documents for identifi...
ERCIMDL
1997
Springer
13 years 9 months ago
Modelling the Retrieval of Structured Documents Containing Texts and Images
Abstract. We present a model for complex documents possibly consisting of a hierarchically structured set of images or texts. Documents are represented both at the form level (as s...
Carlo Meghini, Fabrizio Sebastiani, Umberto Stracc...
ICDAR
2005
IEEE
13 years 11 months ago
Towards a Canonical and Structured Representation of PDF Documents through Reverse Engineering
This article presents Xed, a reverse engineering tool for PDF documents, which extracts the original document layout structure. Xed mixes electronic extraction methods with state-...
Maurizio Rigamonti, Jean-Luc Bloechle, Karim Hadja...
IPM
2008
13 years 5 months ago
Effectiveness of additional representations for the search result presentation on the web
The presentation of search results on the web has been dominated by the textual form of document representation. On the other hand, the document's visual aspects such as the ...
Hideo Joho, Joemon M. Jose
ICML
2008
ACM
14 years 6 months ago
Semi-supervised learning of compact document representations with deep networks
Finding good representations of text documents is crucial in information retrieval and classification systems. Today the most popular document representation is based on a vector ...
Marc'Aurelio Ranzato, Martin Szummer
DGO
2006
13 years 6 months ago
Next steps in near-duplicate detection for eRulemaking
Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...
Hui Yang, Jamie Callan, Stuart W. Shulman