In this paper, we introduce a system that synthesizes emotional audio-visual speech for a 3-D talking agent by adopting the PAD (Pleasure-Arousal-Dominance) emotional model. A ...
Dialogue moves influence and are influenced by the agents’ preferences. We propose a method for modelling this interaction. We motivate and describe a recursive metho...
In this paper, we present a noise level estimator using minimal values of the Short-Time Fourier Transform of a signal embedded in white Gaussian noise. The spectral kurtosis of ...
Various text mining algorithms require the process of feature selection. High-level semantically rich features, such as figurative language use, speech errors, etc., are very prom...
We address the problem of formatting the output of an automatic speech recognition (ASR) system for readability, while preserving word-level timing information of the transcript. O...