The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied...
A labeled sequence data set related to a certain biological property is often biased and, therefore, does not completely capture its diversity in nature. To reduce this sampling b...
In this paper, we derive a data mining framework to analyze 3D features on human faces. The framework leverages kernel density estimators, genetic algorithm and an information com...
Sreenivas R. Sukumar, Hamparsum Bozdogan, David L....
Semi-supervised classification methods aim to exploit labelled and unlabelled examples to train a predictive model. Most of these approaches make assumptions on the distribution ...
A successful interpretation of data goes through discovering crucial relationships between variables. Such a task can be accomplished by a Bayesian network. The dark side is that, ...