In this paper, we present an approach for speaker change detection in broadcast video using joint audio-visual scene change statistics. Our experiments indicate that using joint a...
Automatically extracting semantic content from audio streams can be helpful in many multimedia applications. Motivated by the known limitations of traditional supervised approache...
This article introduces automatic recognition of speech without any audio information. Movements of the tongue, lips, and jaw are tracked by an Electro-Magnetic Articulography (EM...
Multi-stream hidden Markov models (HMMs) have recently been very successful in audio-visual speech recognition, where the audio and visual streams are fused at the final decision...