Sciweavers

ACL
2006

A Phonetic-Based Approach to Chinese Chat Text Normalization

13 years 5 months ago
A Phonetic-Based Approach to Chinese Chat Text Normalization
Chatting is a popular communication media on the Internet via ICQ, chat rooms, etc. Chat language is different from natural language due to its anomalous and dynamic natures, which renders conventional NLP tools inapplicable. The dynamic problem is enormously troublesome because it makes static chat language corpus outdated quickly in representing contemporary chat language. To address the dynamic problem, we propose the phonetic mapping models to present mappings between chat terms and standard words via phonetic transcription, i.e. Chinese Pinyin in our case. Different from character mappings, the phonetic mappings can be constructed from available standard Chinese corpus. To perform the task of dynamic chat language term normalization, we extend the source channel model by incorporating the phonetic mapping models. Experimental results show that this method is effective and stable in normalizing dynamic chat language terms.
Yunqing Xia, Kam-Fai Wong, Wenjie Li
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where ACL
Authors Yunqing Xia, Kam-Fai Wong, Wenjie Li
Comments (0)