ICASSP 2010 (IEEE)
Spoken language translation from parallel speech audio: Simultaneous interpretation as SLT training data

In recent work, we proposed an alternative to parallel text as translation model (TM) training data: audio recordings of parallel speech (pSp), as it occurs in any communication scenario where interpreters are involved. Although interpretation compares poorly to translation, we reported surprisingly strong translation results for systems based on pSp-trained TMs. This work extends the use of pSp as a data source for unsupervised training of all major models involved in statistical spoken language translation. We consider the scenario of speech translation between a resource-rich and a resource-deficient language. Our seed models are based on 10h of transcribed audio and a parallel text comprising 100k translated words. With the help of 92h of untranscribed pSp audio, and by taking advantage of the redundancy inherent to pSp (the same information is given twice, in two languages), we report significant improvements for the resource-deficient acoustic, language and translation models....
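The core idea in the abstract — exploiting pSp redundancy to turn untranscribed parallel audio into pseudo-parallel training text — can be sketched schematically. The sketch below is an illustration of that bootstrapping loop, not the authors' implementation; all function names, data structures, and the toy "ASR" lookup are hypothetical placeholders.

```python
# Schematic sketch of pSp bootstrapping: two parallel audio channels
# (resource-rich and resource-deficient language) carry the same content,
# so decoding both channels yields pseudo-parallel sentence pairs that can
# augment a small seed bitext for TM training.
# All names here are illustrative, not from the paper.

def decode(asr_model, utterance_id):
    """Toy 'ASR': the model is just a dict mapping audio ids to hypotheses."""
    return asr_model.get(utterance_id, "")

def bootstrap_bitext(psp_segments, asr_rich, asr_deficient, seed_bitext):
    """Augment a seed bitext with pseudo-parallel pairs from pSp audio."""
    bitext = list(seed_bitext)
    for seg_rich, seg_def in psp_segments:
        hyp_rich = decode(asr_rich, seg_rich)     # rich-language hypothesis
        hyp_def = decode(asr_deficient, seg_def)  # deficient-language hypothesis
        if hyp_rich and hyp_def:                  # keep only non-empty pairs
            bitext.append((hyp_rich, hyp_def))
    return bitext

# Toy data: two untranscribed pSp segment pairs plus a one-pair seed bitext.
asr_rich = {"a1": "good morning", "a2": "thank you"}
asr_def = {"b1": "guten morgen", "b2": "danke"}
segments = [("a1", "b1"), ("a2", "b2")]
seed = [("hello", "hallo")]

tm_data = bootstrap_bitext(segments, asr_rich, asr_def, seed)
print(len(tm_data))  # 3 sentence pairs now available for TM training
```

In the actual system such pseudo-bitext would be filtered by confidence and fed to standard TM, LM, and acoustic-model training; the sketch only shows where the pSp redundancy enters the pipeline.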
Matthias Paulik, Alex Waibel
Added 25 Jan 2011
Updated 25 Jan 2011
Type Conference
Year 2010
Where ICASSP
Authors Matthias Paulik, Alex Waibel