We present a multi-camera system for audio-visual analysis of dance figures. The multi-view video of a dancing actor is acquired using 8 synchronized cameras. The motion capture t...
This paper presents an implemented computational model of word acquisition which learns directly from raw multimodal sensory input. Set in an information theoretic framework, the ...
We describe the modification of a grammar to take advantage of prosodic information provided by a speech recognition system. This initial study is limited to the use of relative d...
—We consider representing a short temporal fragment of musical audio as a dynamic texture, a model of both the timbral and rhythmical qualities of sound, two of the important asp...
Luke Barrington, Antoni B. Chan, Gert R. G. Lanckr...
In this paper we present a number of improvements that were recently made to the template based speech recognition system developed at ESAT. Combining these improvements resulted ...
Kris Demuynck, Dino Seppi, Hugo Van hamme, Dirk Va...