Abstract. This paper presents an architecture that enables the recognizer to learn incrementally and, thereby adapt to document image collections for performance improvement. We ar...
Tables are ubiquitous. Unfortunately, no search engine supports table search. In this paper, we propose a novel table specific searching engine, TableSeer, to facilitate the table...
Low-dimensional topic models have been proven very useful for modeling a large corpus of documents that share a relatively small number of topics. Dimensionality reduction tools s...
The following paper presents a case study of the way that the research done by the LAIRAH project may be applied in the case of a real digital resource for humanities scholarship....
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...