During data warehouse design, the designer frequently encounters the problem of choosing among different alternatives for the same design construct. The behavior of the chosen desi...
George Papastefanatos, Panos Vassiliadis, Alkis Si...
Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This p...
Several marketing problems involve prediction of customer purchase behavior and forecasting future preferences. We consider predictive modeling of large-scale, bi-modal or multimo...
In this work we propose a novel approach to anomaly detection in streaming communication data. We first build a stochastic model for the system based on temporal communication pa...
XML is becoming a prevalent format for data exchange. Many XML documents have complex schemas that are not always known, and can vary widely between information sources and applica...
Eugene Agichtein, C. T. Howard Ho, Vanja Josifovsk...