Cross-lingual audio-to-text alignment for multimedia content management

This paper addresses a content management problem in situations where we have a collection of spoken documents in audio stream format in one language and a collection of related text documents in another. In our case, we have a huge digital archive of audio broadcast news in Taiwanese, but we do not have transcriptions for it. Meanwhile, we have a collection of related text-based news stories, but they are written in Chinese characters. Due to the lack of a standard written form for Taiwanese, manual transcription of spoken documents is prohibitively expensive, and automatic transcription by speech recognition is infeasible because of its poor performance for Taiwanese spontaneous speech. We present an approximate solution by aligning Taiwanese spoken documents with related text documents in Mandarin. The idea is to take advantage of the abundance of Mandarin text documents available in our application to compensate for the limitations of speech recognition systems. Experimental resul...

Dau-Cheng Lyu, Ren-Yuan Lyu, Yuang-Chin Chiang, Chun-Nan Hsu

DSS 2008 | Spoken Documents | Taiwanese Spoken Documents | Text Documents |

Added	10 Dec 2010
Updated	10 Dec 2010
Type	Journal
Year	2008
Where	DSS
Authors	Dau-Cheng Lyu, Ren-Yuan Lyu, Yuang-Chin Chiang, Chun-Nan Hsu
