We have previously reported on ProPOSEL, a purpose-built Prosody and PoS English Lexicon compatible with the Python Natural Language ToolKit. ProPOSEC is a new corpus research res...
High-level spoken document analysis is required in many applications seeking access to the semantic content of audio data, such as information retrieval, machine translation or au...
Julien Fayolle, Fabienne Moreau, Christian Raymond...
While automatic methods for phonetic segmentation of speech can help with rapid annotation of corpora, most methods rely either on manually segmented data to initially train the p...
In large vocabulary continuous speech recognition, decision trees are widely used to cluster triphone states. In addition to commonly used phonetically based questions, others hav...
Hank Liao, Christopher Alberti, Michiel Bacchiani,...
CASIA-CASSIL is a large-scale corpus base of Chinese human-human naturally-occurring telephone conversations in restricted domains. The first edition consists of 792 90-second con...