Abstract. This paper proposes a novel method for speaker identification based on both speech utterances and their transcribed text. The transcribed text of each speaker's utte...
This paper describes the main components of MiPad (Multimodal Interactive PAD) and especially its distributed speech processing aspects. MiPad is a wireless mobile PDA prototype th...
Li Deng, Kuansan Wang, Alex Acero, Hsiao-Wuen Hon,...
In this paper, a large-scale real-world speech database is introduced along with other multimedia driving data. We designed a data collection vehicle equipped with various sensors...
This paper presents an implemented computational model of word acquisition which learns directly from raw multimodal sensory input. Set in an information theoretic framework, the ...
This paper presents a multimodal crisis management system (XISM). It employs processing of natural gesture and speech commands elicited by a user to efficiently manage complex dyn...
Nils Krahnstoever, Emilio Schapira, Sanshzar Kette...