We present an approach to detecting and recognizing spoken isolated phrases based solely on visual input. We adopt an architecture that first employs discriminative detection of ...
Kate Saenko, Karen Livescu, Michael Siracusa, Kevi...
In this paper, two new feature tracking algorithms are proposed. In the first algorithm, a perspective camera model is used. Making use of the projective invariant of Barrett, and ...
In this paper, we compare several approaches for the extraction of modulation frequency features from the speech signal using a phoneme recognition system. The general framework in th...
Data fusion is the process of integrating multiple sources of information such that their combination yields better results than if the data sources are used individually. This ...
The impact of image feature locations in the bag-of-words model for object classification is examined. It is demonstrated that a simple variance-based method works well and offer...