This paper presents a novel audio-visual fusion method for speech detection, which is an important front-end for content-based video processing. This approach aims to extract homo...
Cong Li, Zhijian Ou, Wei Hu, Tao Wang, Yimin Zhang
In this paper, we present a systems approach for channel modeling of an Automatic Speech Recognition (ASR) system. This can have implications in improving speech recognition compo...
Qun Feng Tan, Kartik Audhkhasi, Panayiotis G. Geor...
The regular occurrence of disfluencies is a distinguishing characteristic of spontaneous speech. Detecting and removing such disfluencies can substantially improve the usefulness ...
Many of the kinds of language model used in speech understanding suffer from imperfect modeling of intra-sentential contextual influences. I argue that this problem can be address...
Context plays a valuable role in any image understanding task confirmed by numerous studies which have shown the importance of contextual information in computer vision t...
Sobhan Naderi Parizi, Ivan Laptev, Alireza Tavakol...