Abstract. Most real-world datasets are, to a certain degree, skewed. When considered that they are also large, they become the pinnacle challenge in data analysis. More importantly...
Clustering is the process of grouping a set of objects into classes of similar objects. Because of unknownness of the hidden patterns in the data sets, the definition of similari...
This paper is about non-approximate acceleration of high-dimensional nonparametric operations such as k nearest neighbor classifiers. We attempt to exploit the fact that even if w...
In an interactive classification application, a user may find it more valuable to develop a diagnostic decision support method which can reveal significant classification behavior...
Traditional visualization techniques for multidimensional data sets, such as parallel coordinates, glyphs, and scatterplot matrices, do not scale well to high numbers of dimension...
Jing Yang, Matthew O. Ward, Elke A. Rundensteiner,...