Sciweavers

EMNLP
2008

Revealing the Structure of Medical Dictations with Conditional Random Fields

Automatic processing of medical dictations poses a significant challenge. We approach the problem by introducing a statistical framework capable of identifying types and boundaries of sections, lists and other structures occurring in a dictation, thereby gaining explicit knowledge about the function of such elements. Training data is created semiautomatically by aligning a parallel corpus of corrected medical reports and corresponding transcripts generated via automatic speech recognition. We highlight the properties of our statistical framework, which is based on conditional random fields (CRFs) and implemented as an efficient, publicly available toolkit. Finally, we show that our approach is effective both under ideal conditions and for real-life dictation involving speech recognition errors and speech-related phenomena such as hesitations and repetitions.
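To illustrate what "identifying types and boundaries" can mean in a sequence-labeling setting, here is a minimal sketch, not the authors' toolkit. It assumes a BIO-style tag scheme in which a sequence model such as a CRF assigns one tag per transcribed token, and it shows only how such tags could be decoded into typed segments. The tag names (`heading`, `section`, `list_item`), the function `decode_segments`, and the sample tokens are all hypothetical; the abstract does not specify the label scheme actually used.

```python
from typing import List, Optional, Tuple


def decode_segments(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group tokens into (structure_type, text) segments from BIO-style tags.

    Assumed (hypothetical) tag format: "B-<type>" opens a segment,
    "I-<type>" continues it, and "O" marks a token outside any structure,
    for example a hesitation.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    segments: List[Tuple[str, str]] = []
    current_type: Optional[str] = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        prefix, _, seg_type = tag.partition("-")
        # An "I-" tag whose type differs from the open segment is treated
        # as the start of a new segment rather than as an error.
        starts_new = prefix == "B" or (prefix == "I" and seg_type != current_type)
        if prefix == "O" or starts_new:
            if current_tokens:
                segments.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
        if prefix in ("B", "I"):
            current_type = seg_type
            current_tokens.append(token)

    # Flush the segment that is still open at the end of the dictation.
    if current_tokens:
        segments.append((current_type, " ".join(current_tokens)))
    return segments


# Hypothetical tagger output for a short dictation fragment.
example_tokens = ["history", "patient", "reports", "cough",
                  "medications", "aspirin", "uh", "ibuprofen"]
example_tags = ["B-heading", "B-section", "I-section", "I-section",
                "B-heading", "B-list_item", "O", "B-list_item"]

example_segments = decode_segments(example_tokens, example_tags)
for seg_type, text in example_segments:
    print(f"{seg_type}: {text}")
```

In this sketch the hesitation "uh" is tagged `O` and therefore dropped, while each `B-` tag marks a boundary and carries the structure type. A real system would obtain the tags from a trained model rather than from a hand-written list.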
Jeremy Jancsary, Johannes Matiasek, Harald Trost
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where EMNLP