This paper proposes a new method for bimodal information fusion in audio-visual speech recognition, where cross-modal association is considered in two levels. First, the acoustic a...
We apply independent component analysis (ICA) for extracting an optimal basis to the problem of finding efficient features for a speaker. The basis functions learned by the algori...
Distributed clientisewer models are becoming increasingly prevalent in multimedia systems and advanced user interface design. A multimedia application, for example, may play and r...
The current research efforts in the field of video parsing and analysis are mainly focused on the use of pictorial information, while neglecting an important supplementary source ...
Massimo De Santo, Gennaro Percannella, Carlo Sanso...
Person identification using audio (speech) and visual (facial appearance, static or dynamic) modalities, either independently or jointly, is a thoroughly investigated problem in pa...