LREC
2010

Design and Development of Part-of-Speech-Tagging Resources for Wolof (Niger-Congo, spoken in Senegal)

In this paper, we report on the design of a part-of-speech tagset for Wolof and on the creation of a semi-automatically annotated gold standard. The main motivation for this resource is to obtain data for training automatic taggers with machine learning approaches. Hence, we take machine learning considerations into account during tagset design and present training experiments as part of this paper. The best automatic tagger achieves an accuracy of 95.2% in cross-validation experiments. We also wanted to create a basis for experimenting with annotation projection techniques, which exploit parallel corpora. For this reason, we used a part of the Bible as the gold standard corpus, since sentence-aligned parallel versions of it are easy to obtain in many languages.
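For readers unfamiliar with how a tagger's accuracy is estimated by cross-validation, the following is a minimal sketch under stated assumptions. It is not the tagger, tagset, or data used in the paper: the most-frequent-tag baseline, the function names (`train_baseline`, `tag`, `cross_validate`), the fold scheme, and the English toy sentences are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each sentence is a list of (word, tag) pairs.
# These are illustrative English placeholders, not Wolof data.
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("runs", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]


def train_baseline(sentences):
    """Learn the most frequent tag per word, plus a global fallback tag."""
    per_word = defaultdict(Counter)
    all_tags = Counter()
    for sentence in sentences:
        for word, tag_label in sentence:
            per_word[word][tag_label] += 1
            all_tags[tag_label] += 1
    fallback = all_tags.most_common(1)[0][0]
    lexicon = {w: c.most_common(1)[0][0] for w, c in per_word.items()}
    return lexicon, fallback


def tag(model, words):
    """Tag known words from the lexicon; unknown words get the fallback tag."""
    lexicon, fallback = model
    return [lexicon.get(w, fallback) for w in words]


def cross_validate(sentences, k=2):
    """Return the mean token-level accuracy over k folds."""
    accuracies = []
    for i in range(k):
        # Every k-th sentence (offset i) is held out; the rest is training data.
        test = sentences[i::k]
        train = [s for j, s in enumerate(sentences) if j % k != i]
        model = train_baseline(train)
        correct = total = 0
        for sentence in test:
            words = [w for w, _ in sentence]
            gold = [t for _, t in sentence]
            predicted = tag(model, words)
            correct += sum(p == g for p, g in zip(predicted, gold))
            total += len(gold)
        accuracies.append(correct / total)
    return sum(accuracies) / k


if __name__ == "__main__":
    print(f"mean accuracy: {cross_validate(TOY_CORPUS, k=2):.3f}")
```

Each sentence is held out exactly once, so the reported figure is an average over models that never saw their own test sentences. A realistic setup would replace the baseline with a trained sequence tagger and use the annotated gold standard in place of the toy corpus.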
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where LREC
Authors Cheikh M. Bamba Dione, Jonas Kuhn, Sina Zarrieß