FSM-based pronunciation modeling using articulatory phonological code

According to articulatory phonology, the gestural score is an invariant speech representation. Though the timing schemes, i.e., the onsets and offsets, of the gestural activations may vary, the ensemble of these activations tends to remain unchanged, informing the speech content. In this work, we propose a pronunciation modeling method that uses a finite state machine (FSM) to represent the invariance of a gestural score. Given the "canonical" gestural score (CGS) of a word with a known activation timing scheme, the plausible activation onsets and offsets are recursively generated and encoded as a weighted FSM. An empirical measure is used to prune out gestural activation timing schemes that deviate too much from the CGS. Speech recognition is achieved by matching the recovered gestural activations to the FSM-encoded gestural scores of different speech contents. We carry out pilot word classification experiments using synthesized data from one speaker. The proposed pronuncia...
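The generate, prune, and match steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the pruning measure, the FSM topology, or the weights, so everything below is an assumption made for illustration. A gestural score is assumed to be a dict mapping a gesture name to an `(onset, offset)` pair in frames, the pruning measure is assumed to be the total absolute shift from the CGS, timing variants are enumerated exhaustively rather than recursively, and the acceptor is a simple trie over per-frame sets of active gestures. The names `to_frames`, `build_acceptor`, `score`, and `classify`, and the parameters `max_shift` and `max_total_dev`, are hypothetical.

```python
from itertools import product


def to_frames(scheme):
    """Turn {gesture: (onset, offset)} into a tuple of per-frame active-gesture sets."""
    end = max(off for _, off in scheme.values())
    return tuple(
        frozenset(g for g, (on, off) in scheme.items() if on <= t < off)
        for t in range(end)
    )


def build_acceptor(cgs, max_shift=1, max_total_dev=2):
    """Build a trie-shaped weighted acceptor from a canonical gestural score.

    Assumption: each onset/offset may shift by at most `max_shift` frames, and a
    timing scheme is pruned when its total absolute shift exceeds `max_total_dev`.
    Returns (arcs, final): arcs[state][symbol] is the next state, and
    final[state] is the weight (total shift) of a scheme ending in that state.
    """
    gestures = sorted(cgs)
    arcs = [{}]   # state 0 is the start state
    final = {}
    shifts = range(-max_shift, max_shift + 1)
    for deltas in product(shifts, repeat=2 * len(gestures)):
        dev = sum(abs(d) for d in deltas)
        if dev > max_total_dev:
            continue  # pruned: deviates too much from the canonical score
        scheme = {
            g: (cgs[g][0] + deltas[2 * i], cgs[g][1] + deltas[2 * i + 1])
            for i, g in enumerate(gestures)
        }
        if any(on < 0 or off <= on for on, off in scheme.values()):
            continue  # not a valid activation interval
        state = 0
        for symbol in to_frames(scheme):
            if symbol not in arcs[state]:
                arcs.append({})
                arcs[state][symbol] = len(arcs) - 1
            state = arcs[state][symbol]
        final[state] = min(dev, final.get(state, dev))
    return arcs, final


def score(acceptor, frames):
    """Return the weight of `frames` under the acceptor, or None if rejected."""
    arcs, final = acceptor
    state = 0
    for symbol in frames:
        state = arcs[state].get(symbol)
        if state is None:
            return None
    return final.get(state)


def classify(frames, lexicon):
    """Pick the word whose acceptor accepts `frames` with the lowest weight."""
    scored = {w: score(build_acceptor(c), frames) for w, c in lexicon.items()}
    scored = {w: s for w, s in scored.items() if s is not None}
    return min(scored, key=scored.get) if scored else None
```

With a toy lexicon such as `{"A": {"lips": (0, 3), "tongue": (2, 5)}}`, calling `classify(to_frames({"lips": (0, 3), "tongue": (3, 5)}), lexicon)` matches word `"A"` because the tongue onset moved by only one frame, while a score whose activations drift further than the pruning threshold is rejected. A practical system would use a compact FSM and a toolkit built for weighted automata rather than an exhaustive trie.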

Chi Hu, Xiaodan Zhuang, Mark Hasegawa-Johnson

Gestural | Gestural Activations | Gestural Scores | INTERSPEECH 2010 | Signal Processing |

Added	18 May 2011
Updated	18 May 2011
Type	Journal
Year	2010
Where	INTERSPEECH
Authors	Chi Hu, Xiaodan Zhuang, Mark Hasegawa-Johnson
