In this paper we demonstrate that Long Short-Term Memory (LSTM) is a differentiable recurrent neural net (RNN) capable of robustly categorizing timewarped speech data. We measure ...
The performance of a spoken language recognition system is typically formulated to reflect the detection cost and the strategic decision points along the detection-error-tradeoff cur...
A voice search system requires a speech interface that can correctly recognize spoken queries uttered by users. The recognition performance strongly relies on a robust language mo...
Xiao Li, Patrick Nguyen, Geoffrey Zweig, Dan Bohus
This paper presents a new strategy for designing parallel phone recognizers for spoken language recognition. Given a collection of parallel phone recognizers, we select a subs...
In this paper we propose a new technique to enhance emotion recognition by combining, in different ways, what we call emotion predictions. The technique is called F2 as the combinat...