Accessing specific or salient parts of multimedia recordings remains a challenge as there is no obvious way of structuring and representing a mix of space-based and time-based med...
We propose a new approach for combining acoustic and visual measurements to aid in recognizing lip shapes of a person speaking. Our method relies on computing the maximum likeliho...
Alignment combination (symmetrization) has been shown to be useful for improving Machine Translation (MT) models. Most existing alignment combination techniques are based on heuri...
This paper describes work in progress on automatic generation of "impact sounds" based on physical modelling. These sounds can be used as non-speech audio presentation of...
Alireza Darvishi, Valentin Guggiana, Eugen Muntean...
Research activity on the Portuguese language for speech synthesis and recognition has suffered from a considerable lack of human and material resources. This has raised some obsta...