In this paper, we propose a fast, memory-efficient, and scalable clustering algorithm for analyzing transactional data. Our approach has three unique features. First, we use the c...
We present a large-scale analysis of the content of weblogs dating back to the release of the Blogger program in 1999. Over one million blogs were analyzed from their conception t...
Document clustering has been used to improve document retrieval, document browsing, and text mining. In this paper, we investigate whether the biomedical ontology MeSH improves the cluster...
SimPoint is a technique for choosing which parts of a program’s execution to simulate in order to obtain a complete picture of that execution. SimPoint uses data clustering algorithms...
We present a solution to the problem of identifying clusters from MIMO measurement data in a data window, with a minimum of user interaction. Conventionally, visual inspection ...
Nicolai Czink, Pierluigi Cera, Jari Salo, Ernst Bo...