Over the years the amount and range of electronic text stored on the WWW has expanded rapidly, overwhelming both users and tools designed to index and search the information. It is...
Document-centric XML collections contain text-rich documents, marked up with XML tags. The tags add lightweight semantics to the text. Querying such collections calls for a hybrid...
We present our hybrid system for the PAN challenge at CLEF 2010. Our system performs plagiarism detection for translated and non-translated externally as well as intrinsically plag...
Markus Muhr, Roman Kern, Mario Zechner, Michael Gr...
Versioned and bug-tracked software systems provide a huge amount of historical data regarding source code changes and issues management. In this paper we deal with impact analysis...
Information retrieval systems can be partitioned into two main classes: large-scale systems that make use of an inverted index or some other auxiliary data structure, intended for...