Background: High-throughput methods of the genome era produce vast amounts of data in the form of gene lists. These lists are large and difficult to interpret without advanced com...
Despite the outstanding successes of state-of-the-art clustering algorithms, many of them still suffer from shortcomings. Chiefly, these algorithms do not capture coherency and homo...
We present the IBM systems for the Rich Transcription 2007 (RT07) speaker diarization evaluation task on lecture meeting data. We first give an overview of our baseline system that was devel...
We present a new L1-distance-based k-means clustering algorithm to address the challenge of clustering high-dimensional proportional vectors. The new algorithm explicitly incorpor...
Bonnie K. Ray, Hisashi Kashima, Jianying Hu, Monin...
Traditional clustering algorithms work on "flat" data, assuming that the data instances can only be represented by a set of homogeneous and uniform features...
Levent Bolelli, Seyda Ertekin, Ding Zhou, C. Lee G...