Sciweavers

ICASSP
2008
IEEE

An instantaneous vector representation of delta pitch for speaker-change prediction in conversational dialogue systems

As spoken dialogue systems become deployed in increasingly complex domains, they face rising demands on the naturalness of interaction. We focus on system responsiveness, aiming to mimic human-like dialogue flow control by predicting speaker changes as observed in real human-human conversations. We derive an instantaneous vector representation of pitch variation and show that it is amenable to standard acoustic modeling techniques. Using a small amount of automatically labeled data, we train models which significantly outperform current state-of-the-art pause-only systems, and replicate to within 1% absolute the performance of our previously published hand-crafted baseline. The new system additionally offers scope for run-time control over the precision or recall of locations at which to speak.
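The abstract does not say how the run-time control over precision or recall is exposed. One common way to obtain such a trade-off, assumed here purely for illustration, is to compare a per-location model score against an adjustable decision threshold. The sketch below uses made-up scores and labels, and the function name `precision_recall` is hypothetical; it is not the authors' implementation.

```python
from typing import List, Tuple


def precision_recall(
    scores: List[float], labels: List[bool], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when predicting 'speak here' for score >= threshold.

    scores: hypothetical model scores, one per candidate location.
    labels: True where a speaker change actually occurred (illustrative data).
    """
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    # Conventions for empty denominators: no predictions means no false alarms.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative, invented data: five candidate locations.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [True, True, False, True, False]

# A strict threshold favors precision; a lenient one favors recall.
strict = precision_recall(scores, labels, 0.7)
lenient = precision_recall(scores, labels, 0.3)
print(strict, lenient)
```

Raising the threshold makes the system speak at fewer locations with fewer false starts, while lowering it catches more genuine speaker changes at the cost of more interruptions.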
Kornel Laskowski, Jens Edlund, Mattias Heldner
Added 30 May 2010
Updated 30 May 2010
Type Conference
Year 2008
Where ICASSP
Authors Kornel Laskowski, Jens Edlund, Mattias Heldner