Reshaping automatic speech transcripts for robust high-level spoken document analysis

High-level spoken document analysis is required in many applications that need access to the semantic content of audio data, such as information retrieval, machine translation, or automatic summarization. It is nevertheless a difficult task, generally based on transcripts provided by an automatic speech recognition system. Unlike standard texts, transcripts are highly noisy data because of word recognition errors that affect, in particular, highly significant words such as named entities (e.g. person names, locations, organizations). Transcripts also contain features specific to spoken language that make natural language processing tools designed for written text ineffective on them. To overcome these issues, this paper proposes a method to reshape automatic speech transcripts for robust high-level spoken document analysis. The method consists of designing a new word-level confidence measure that may efficiently ensure the reliability of transcribed wo...

Julien Fayolle, Fabienne Moreau, Christian Raymond, Guillaume Gravier

AND 2010 | Automatic Speech | Automatic Speech Recognition | Document Analysis | Machine Learning

» Robust Question Answering for Speech Transcripts UPC Experience in QAst 2009

» Robust Question Answering for Speech Transcripts UPC Experience in QAst 2008

» Automatic Rich Annotation of Large Corpus of Conversational transcribed speech the Chunkin...

» Dublin City University at CLEF 2006 CrossLanguage Speech Retrieval CLSR Experiments

» Unsupervised Topic Modelling for MultiParty Spoken Discourse

» Term clouds as surrogates for user generated speech

» A Critical Reassessment of Evaluation Baselines for Speech Summarization

Post Info

Added	10 Feb 2011
Updated	10 Feb 2011
Type	Journal
Year	2010
Where	AND
Authors	Julien Fayolle, Fabienne Moreau, Christian Raymond, Guillaume Gravier
