In this paper we present the design, implementation and evaluation of a runtime system based on collective I/O techniques for irregular applications. We present two models, namely...
Achieving high performance on today’s architectures requires careful orchestration of many optimization parameters. In particular, the presence of shared-caches on multicore arch...
We present an approach for recovering articulated body pose from single monocular images using the Specialized Mappings Architecture (SMA), a non-linear supervised learning archit...
Most similarity search techniques map the data objects into some high-dimensional feature space. The similarity search then corresponds to a nearest-neighbor search in the feature...
We consider the filter decomposition problem in supporting coarse-grained pipelined parallelism. This form of parallelism is suitable for data-driven applications in scenarios wh...