Government regulations are semi-structured text documents that are often voluminous, heavily cross-referenced between provisions, and even ambiguous. Multiple sources of regulation...
Unexpected rules are interesting because they are either previously unknown or deviate from what prior user knowledge would suggest. In this paper, we study three important issues...
Projected clustering has become a hot research topic due to its ability to cluster high-dimensional data. However, most existing projected clustering algorithms depend on some cri...
High-dimensional directional data is becoming increasingly important in contemporary applications such as analysis of text and gene-expression data. A natural model for multivaria...
Arindam Banerjee, Inderjit S. Dhillon, Joydeep Gho...
Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal o...