Comma Restoration Using Constituency Information

Automatic restoration of punctuation from unpunctuated text has application in improving the fluency and applicability of speech recognition systems. We explore the possibility that syntactic information can be used to improve the performance of an HMM-based system for restoring punctuation (specifically, commas) in text. Our best methods reduce sentence error rate substantially, by some 20%, with an additional 8% reduction possible given improvements in extraction of the requisite syntactic information.

1 Motivation

The move from isolated word to connected speech recognition engendered a qualitative improvement in the naturalness of users’ interactions with speech transcription systems, sufficient even to make up in user satisfaction for some modest increase in error rate. Nonetheless, such systems still retain an important source of unnaturalness in dictation, the requirement to utter all punctuation explicitly. In order to free the user from this burden, a transcription sys...
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2003
Authors Stuart M. Shieber, Xiaopeng Tao