Abstract. Clustering is a problem of great practical importance in numerous applications. The problem of clustering becomes more challenging when the data is categorical, that is, ...
We consider a model in which background knowledge on a given domain of interest is available in terms of a Bayesian network, in addition to a large database. The mining problem is...
This article explores how to develop complex data-driven user models that go beyond the bag-of-words model and topical relevance. We propose to learn from rich user-specific info...
Data quality is a critical problem in modern databases. Data entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been li...
Kuang Chen, Harr Chen, Neil Conway, Joseph M. Hell...
Software evolution research inherently has several resource-intensive logistical constraints. Archived project artifacts, such as those found in source code repositories and bug tr...
Jennifer Bevan, E. James Whitehead Jr., Sunghun Ki...