Abstract. We present a Bayesian approach for simultaneously estimating the number of people in a crowd and their spatial locations by sampling from a posterior distribution over crowd configurations. Although this framework extends naturally from single-view to multiview detection, we show that the naive extension leads to an inefficient sampler that is easily trapped in local modes. We therefore develop a set of novel proposals that leverage multiview geometry to make global moves that jump more efficiently between modes of the posterior distribution. We also develop a statistical model of crowd configurations that handles dependencies among people without requiring discretization of their spatial locations. We quantitatively evaluate our algorithm on a publicly available benchmark dataset with different crowd densities and environmental conditions, and show that our approach outperforms other state-of-the-art methods for detecting and counting people in crowds.