Background: A survey of microarray databases reveals that most of the repository contents and data models are heterogeneous (i.e., data obtained from different chip manufacturers)...
Todd H. Stokes, J. T. Torrance, Henry Li, May D. W...
Today’s users want to access their data everywhere and at any time – in various environments and on various occasions. The data itself can be very complex – the problem is then in providi...
Vladislav Nemec, Pavel Zikovsky, Pavel Slaví...
Efficiently detecting outliers or anomalies is an important problem in many areas of science, medicine and information technology. Applications range from data cleaning to clinica...
Matthew Eric Otey, Amol Ghoting, Srinivasan Partha...
Addressed in this paper is the issue of 'email data cleaning' for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...
Data is often stored in summarized form, as a histogram of aggregates (COUNTs, SUMs, or AVeraGes) over specified ranges. We study how to estimate the original detail data from the ...
Christos Faloutsos, H. V. Jagadish, Nikolaos Sidir...