Path profiles provide a more accurate characterization of a program's dynamic behavior than basic block or edge profiles, but are relatively more expensive to collect. This has limited their use in practice despite demonstrations of their advantages over edge profiles for a wide variety of applications. We present a new algorithm called preferential path profiling (PPP), that reduces the overhead of path profiling. PPP leverages the observation that most consumers of path profiles are only interested in a subset of all program paths. PPP achieves low overhead by separating interesting paths from other paths and assigning a set of unique and compact numbers to these interesting paths. We draw a parallel between arithmetic coding and path numbering, and use this connection to prove an optimality result for the compactness of path numbering produced by PPP. This compact path numbering enables our PPP implementation to record path information in an array instead of a hash table. Our ...
Kapil Vaswani, Aditya V. Nori, Trishul M. Chilimbi