The crucial issue in many classification applications is how to achieve the best possible classifier with a limited amount of labeled data for training. Training data selection is ...
Data is often stored in summarized form, as a histogram of aggregates (COUNTs, SUMs, or AVeraGes) over specified ranges. We study how to estimate the original detail data from the ...
Christos Faloutsos, H. V. Jagadish, Nikolaos Sidir...
Background: Overfitting the data is a salient issue for classifier design in small-sample settings. This is why selecting a classifier from a constrained family of classifiers, on...
Jianping Hua, James Lowey, Zixiang Xiong, Edward R...
Exploratory data mining is fundamental to fostering an appreciation of complex datasets. For large and continuously growing datasets, such as obtained by regular sampling of an or...
Background: Recent progress in genotyping technologies allows the generation of high-density genetic maps using hundreds of thousands of genetic markers for each DNA sample. The ava...