We address the problem of automatic interpretation of non-exaggerated human facial and body behaviours captured in video. We illustrate our approach with three examples. (1) We intro...
Demand for bus surveillance is growing due to the increased threats of terrorist attack, vandalism and litigation. However, CCTV systems are traditionally used in forensic mode,...
The detection of prosodic characteristics is an important aspect of both speech synthesis and speech recognition. Correct placement of pitch accents aids in more natural-sounding ...
In this paper we investigate the combination of complementary acoustic feature streams in large vocabulary continuous speech recognition (LVCSR). We have explored the use of acoust...
In this paper we discuss the design, acquisition and preprocessing of a Czech audio-visual speech corpus. The corpus is intended for training and testing of existing audio-visual ...