Neural network language models (NNLMs) have become an increasingly popular choice for large vocabulary continuous speech recognition (LVCSR) tasks, due to their inherent generalisa...
Junho Park, Xunying Liu, Mark J. F. Gales, Philip ...
We examine in some detail Mel Frequency Cepstral Coefficients (MFCCs) - the dominant features used for speech recognition - and investigate their applicability to modeling music. ...
In this paper, we describe a new multi-purpose audio-visual database in the context of speech interfaces for controlling household electronic devices. The database comprises speec...
SOM and LVQ algorithms for symbol strings have been introduced and applied to isolated-word recognition, for the construction of an optimal pronunciation dictionary for a given spe...
Editing speech data is currently time-consuming and error-prone. Speech editors rely on acoustic waveform representations, which force users to repeatedly sample the underlying spe...