Low-dimensional topic models have proven very useful for modeling a large corpus of documents that share a relatively small number of topics. Dimensionality reduction tools s...
We present a method for automated topic suggestion. Given a plain-text input document, our algorithm produces a ranking of novel topics that could enrich the input document in a m...
The choice of binarization algorithm is critical for any document image processing system, since binarization is one of the first tasks and any mistake it makes will be car...
Social annotation via so-called collaborative tagging describes the process by which many users add metadata in the form of unstructured keywords to shared content. In this paper,...
The dominance of digital objects in today's information landscape has changed the way humankind creates and exchanges information. However, it has also brought an entirely ne...
Christoph Becker, Andreas Rauber, Volker Heydegger...
This paper presents a framework for automatically generating structural XML documents. The user provides a target DTD and an example of an XML document, called a Generate-XML-ByEx...
The creation and deployment of knowledge repositories for managing, sharing, and reusing tacit knowledge within an organization has emerged as a prevalent approach in current know...
The problem of document replacement in web caches has received much attention in recent research, and it has been shown that the eviction rule "replace the least recently used...
Since the WWW encourages hypertext and hypermedia document authoring (e.g. HTML or XML), Web authors tend to create documents that are composed of multiple pages connected with hyperl...