Multimodal grammars provide an expressive formalism for multimodal integration and understanding. However, handcrafted multimodal grammars can be brittle with respect to unexpecte...
We investigate techniques for acoustic modeling in automatic recognition of context-independent phoneme strings from the TIMIT database. The baseline phoneme recognizer is based on...
We propose a robust scene recognition system for baseball broadcast videos. This system is based on the data-driven approach which has been successful in continuous speech recogni...
With the purpose of improving Spoken Language Understanding (SLU) performance, a combination of different acoustic speech recognition (ASR) systems is proposed. State a-posteriori...
This paper describes the CMU/InterACT effort in developing an Arabic Automatic Speech Recognition (ASR) system for broadcast news and conversations within the GALE 2006 evaluation...