Learning visual models of object categories notoriously requires thousands of training examples; this is due to the diversity and richness of object appearance which requires mode...
We propose a new method to quickly and accurately predict 3D positions of body joints from a single depth image, using no temporal information. We take an object recognition appro...
Jamie Shotton, Andrew Fitzgibbon, Mat Cook, Andrew...
Many real-world applications call for learning predictive relationships from multi-modal data. In particular, in multi-media and web applications, given a dataset of images and th...
A well-built dataset is a necessary starting point for advanced computer vision research. It plays a crucial role in evaluation and provides a continuous challenge to stateof-the-...
Recognizing texts from camera images is a known hard problem because of the difficulties in text segmentation from the varied and complicated backgrounds. In this paper, we propo...