Class posterior distributions have recently been used quite successfully in Automatic Speech Recognition (ASR), either for frame- or phone-level classification or as acoustic featu...
We describe a parser for robust and flexible interpretation of user utterances in a multi-modal system for web search in newspaper databases. Users can speak or type, and they can...
Multi-pitch estimation of co-channel speech is especially challenging when the underlying pitch tracks are close in pitch value (e.g., when pitch tracks cross). Building on our pr...
It is well known that MFCC-based speaker identification (SID) systems easily break down under mismatched training and test conditions. One such mismatch occurs when a SID system ...
Local business voice search is a popular application for mobile phones, where hands-free interaction and speed are critical to users. However, speech recognition accuracy is still...
Giuseppe Di Fabbrizio, Diamantino Caseiro, Amanda ...