This paper summarizes the author’s thesis, which presents a model and an environment for recovering the high-level design of legacy software systems based on user-defined ar...
Data stream applications have made use of statistical summaries to reason about the data using nonparametric tools such as histograms, heavy hitters, and join sizes. However, rela...
A central task when integrating data from different sources is to detect identical items. For example, price comparison websites have to identify offers for identical p...
The Self-Organizing Map is a popular neural network model for data analysis, for which a wide variety of visualization techniques exists. We present a novel technique that takes th...
Automatic classification of documents is an important area of research with many applications in the fields of document searching, forensics and others. Methods to perform class...