An enormous amount of information is available via the Internet. Much of this data is in the form of text-based documents. These documents cover a variety of topics that are v...
Previously, topic models such as PLSI (Probabilistic Latent Semantic Indexing) and LDA (Latent Dirichlet Allocation) were developed for modeling the contents of plain texts. Recent...
This article reports the activities of the INEX 2005 Multimedia track. We successfully realized our objective: to provide a platform for the evaluation of retrie...
This paper discusses aspects of the redocumentation of legacy systems and proposes a model-oriented approach to generating documentation, which is to produce models from existing ...
This paper focuses on a method for the stylistic segmentation of text documents. Our technique involves mapping the change in a feature throughout a text. We use the linguistic fe...