We present spatio-temporal feature descriptors that can be inferred from video and used as building blocks in action recognition systems. They capture the evolution of ``elementar...
In speech recognition, phonemes have demonstrated their efficacy in modeling the words of a language. While they are well defined for languages, their extension to human act...
Kaustubh Kulkarni, Edmond Boyer, Radu Horaud, Amit...
We believe intelligence does not dwell solely in the brain but emerges from active interactions with environments through perception, action, and communication. This paper gives an over...