Binaural Speech Separation Using Recurrent Timing Neural Networks for Joint F0-Localisation Estimation



A speech separation system is described in which sources are represented in a joint interaural time difference-fundamental frequency (ITD-F0) cue space. Traditionally, recurrent timing neural networks (RTNNs) have been used only to extract periodicity information; in this study, this type of network is extended in two ways. Firstly, a coincidence detector layer is introduced, each node of which is tuned to a particular ITD; secondly, the RTNN is extended to become two-dimensional to allow periodicity analysis to be performed at each best ITD. Thus, one axis of the RTNN represents F0 and the other ITD, allowing sources to be segregated on the basis of their separation in ITD-F0 space. Source segregation is performed within individual frequency channels without recourse to across-channel estimates of F0 or ITD that are commonly used in auditory scene analysis approaches. The system is evaluated on spatialised speech signals using energy-based metrics and automatic speech recognition.
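The two-stage idea in the abstract (coincidence detection at each candidate ITD, followed by periodicity analysis at that ITD) can be illustrated with a rough sketch. This is not the authors' RTNN: it substitutes plain lagged products for the recurrent network, works on a single frequency channel, and measures ITD and period in samples. All function names here (`coincidence`, `periodicity`, `itd_f0_map`, `strongest_cell`) are hypothetical and chosen only for illustration.

```python
def coincidence(left, right, itd):
    """Coincidence detector tuned to one ITD (in samples).

    Returns left[k] * right[k + itd] for every k where both indices are valid.
    A positive itd means the right-ear signal lags the left-ear signal.
    """
    n = len(left)
    start = max(0, -itd)
    stop = min(n, n - itd)
    return [left[k] * right[k + itd] for k in range(start, stop)]


def periodicity(signal, lag):
    """Mean lagged product of a signal with itself.

    Simplified stand-in for the periodicity analysis; NOT a recurrent
    timing network.
    """
    count = len(signal) - lag
    if count <= 0:
        return 0.0
    return sum(signal[k] * signal[k + lag] for k in range(count)) / count


def itd_f0_map(left, right, itd_lags, period_lags):
    """Build a 2-D map: one row per candidate ITD, one column per period lag."""
    cue_map = []
    for itd in itd_lags:
        detector_output = coincidence(left, right, itd)
        cue_map.append([periodicity(detector_output, lag) for lag in period_lags])
    return cue_map


def strongest_cell(cue_map, itd_lags, period_lags):
    """Return the (itd, period_lag) pair with the largest map value."""
    best_value = float("-inf")
    best_pair = (None, None)
    for i, row in enumerate(cue_map):
        for j, value in enumerate(row):
            if value > best_value:
                best_value = value
                best_pair = (itd_lags[i], period_lags[j])
    return best_pair
```

Under these assumptions, a period lag of `p` samples at an assumed sample rate `fs` corresponds to F0 = `fs / p`, so the column axis plays the role of the F0 axis and the row axis the ITD axis. Two sources with different ITD or F0 would then appear as separate peaks in the map, which is the property the abstract relies on for segregation.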

Stuart N. Wrigley, Guy J. Brown


ITD Allowing Sources | Machine Learning | MLMI 2007 | Time Difference-fundamental Frequency | Timing Neural Networks



Added	08 Jun 2010
Updated	08 Jun 2010
Type	Conference
Year	2007
Where	MLMI
Authors	Stuart N. Wrigley, Guy J. Brown
