Sampling-based methods have previously been proposed for the problem of finding interesting associations in data, even for low-support items. While these methods do not guarantee ...
—Segmentation, the task of splitting a long sequence of discrete symbols into chunks, can provide important information about the nature of the sequence that is understandable to...
A pervasive problem in large relational databases is identity uncertainty which occurs when multiple entries in a database refer to the same underlying entity in the world. Relati...
This work examines under what conditions compression methodologies can retain the outcome of clustering operations. We focus on the popular k-Means clustering algorithm and we dem...
Deepak S. Turaga, Michail Vlachos, Olivier Versche...
— Redistricting is the process of dividing a geographic area into districts or zones. This process has been considered in the past as a problem that is computationally too comple...