Word-Based Dialect Identification with Georeferenced Rules

We present a novel approach for (written) dialect identification based on the discriminative potential of entire words. We generate Swiss German dialect words from a Standard German lexicon with the help of hand-crafted phonetic/graphemic rules that are associated with occurrence maps extracted from a linguistic atlas created through extensive empirical fieldwork. In comparison with a character n-gram approach to dialect identification, our model is more robust to individual spelling differences, which are frequently encountered in non-standardized dialect writing. Moreover, it covers the whole Swiss German dialect continuum, which trained models struggle to achieve due to sparsity of training data.
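
The idea of deriving dialect words from a standard lexicon with georeferenced rules, then identifying a dialect by whole-word lookup, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the region labels A, B and C, the two rewrite rules, the three-word lexicon, and the simple voting scheme are invented and are not taken from the paper, whose rules are hand-crafted and tied to atlas occurrence maps rather than to plain region sets.

```python
from collections import Counter, defaultdict

# Hypothetical regions; real maps would cover actual locations of the atlas.
REGIONS = ["A", "B", "C"]

# Hypothetical rules: (standard substring, dialect substring, regions where it applies).
RULES = [
    ("au", "uu", {"A", "B"}),
    ("k", "ch", {"A"}),
]

# Tiny stand-in for a Standard German lexicon (lowercased).
LEXICON = ["haus", "kind", "maus"]


def derive(word, region):
    """Apply every rule that is valid in the given region to a standard word."""
    for src, dst, where in RULES:
        if region in where:
            word = word.replace(src, dst)
    return word


def build_index(lexicon):
    """Map each derived dialect form to the set of regions that produce it."""
    index = defaultdict(set)
    for word in lexicon:
        for region in REGIONS:
            index[derive(word, region)].add(region)
    return index


def identify(text, index):
    """Vote per known word; a word shared by several regions splits its vote."""
    votes = Counter()
    for token in text.lower().split():
        regions = index.get(token, ())
        for region in regions:
            votes[region] += 1 / len(regions)
    return votes.most_common(1)[0][0] if votes else None


index = build_index(LEXICON)
print(identify("huus chind", index))
```

Because lookup happens on whole words, a token that matches no derived form simply contributes no vote, whereas a character n-gram model would still score its substrings.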

Yves Scherrer, Owen Rambow

EMNLP 2010 | German Dialect Continuum | German Dialect Words | Natural Language Processing | Swiss German Dialect |

Added	11 Feb 2011
Updated	11 Feb 2011
Type	Conference
Year	2010
Where	EMNLP
Authors	Yves Scherrer, Owen Rambow
