Clustering algorithms such as k-means, the self-organizing map (SOM), or Neural Gas (NG) constitute popular tools for automated information analysis. Since data sets are becoming l...
Constrained clustering has been well-studied for algorithms like K-means and hierarchical agglomerative clustering. However, how to encode constraints into spectral clustering rem...
We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata manage...
Sage A. Weil, Scott A. Brandt, Ethan L. Miller, Da...
Distributing data is a fundamental problem in implementing efficient distributed-memory parallel programs. The problem becomes more difficult in environments where the participa...
D. Brent Weatherly, David K. Lowenthal, Mario Naka...