The maximum a posteriori (MAP) criterion is widely used in statistical model-based voice activity detection (VAD) approaches. In the conventional MAP criterion, however, the ...
Tracking human lips in video is an important but notoriously difficult task. Accurately recovering their motion in 3D from any head pose is even more challenging, though s...
A top-down task-dependent model guides attention to likely target locations in cluttered scenes. Here, a novel biologically plausible top-down auditory attention model is presente...
This paper proposes a computational model of speech comprehension based on neurocognitive research. The computational representation uses techniques such as the wavelet transform and connec...
The present work aims to model the correspondence between facial motion and speech. The face and sound are modelled separately, with phonemes serving as the link between the two. We propo...