This paper examines the performance of simultaneous multithreading (SMT) for network servers using actual hardware, multiple network server applications, and several workloads. Us...
Yaoping Ruan, Vivek S. Pai, Erich M. Nahum, John M...
Optimal network performance is critical to efficient parallel scaling for communication-bound applications on large machines. With wormhole routing, no-load latencies do not increa...
Abhinav Bhatele, Eric J. Bohm, Laxmikant V. Kal&ea...
3D-integration is a promising technology to help combat the “Memory Wall” in future multi-core processors. Past work has considered using 3D-stacked DRAM as a large last-level...
Delaunay triangulation has been much used in such applications as volume rendering, shape representation, terrain modeling and so on. The main disadvantage of Delaunay triangulation is lar...
A distributed memory parallel version of the group average Hierarchical Agglomerative Clustering algorithm is proposed to enable scaling the document clustering problem to large c...
Rebecca Cathey, Eric C. Jensen, Steven M. Beitzel,...