The problem of joint modeling the text and image components of multimedia documents is studied. The text component is represented as a sample from a hidden topic model, learned wi...
Nikhil Rasiwasia, Jose Costa Pereira, Emanuele Cov...
Legal information certification and secured storage combined with documents electronic signature are of great interest when digital documents security and conservation are in conce...
Maxime Wack, Ahmed Nait-Sidi-Moh, Sid Lamrous, Nat...
Users prefer to navigate subjects from organized topics in an abundance resources than to list pages retrieved from search engines. We propose a framework to cluster frequent items...
: BigBatch is an image processing environment designed to process batches of thousands of monochromatic documents. One of the flexibilities and pioneer aspects of BigBatch is offer...
First generation Web-content encodes information in handwritten (HTML) Web pages. Second generation Web content generates HTML pages on demand, e.g. by filling in templates with c...
Jacco van Ossenbruggen, Joost Geurts, Frank Cornel...