LREC
2010

Syntactic Annotation Guidelines for the Quranic Arabic Dependency Treebank

The Quranic Arabic Dependency Treebank (QADT) is part of the Quranic Arabic Corpus (http://corpus.quran.com), an online linguistic resource organized by the University of Leeds, and developed through online collaborative annotation. The website has become a popular study resource for Arabic and the Quran, and is now used by over 1,500 researchers and students daily. This paper presents the treebank, explains the choice of syntactic representation, and highlights key parts of the annotation guidelines. The text being analyzed is the Quran, the central religious book of Islam, written in classical Quranic Arabic (c. 600 CE). To date, all 77,430 words of the Quran have a manually verified morphological analysis, and syntactic analysis is in progress. 11,000 words of Quranic Arabic have been syntactically annotated as part of a gold standard treebank. Annotation guidelines are especially important to promote consistency for a corpus which is being developed through online collabor...
Kais Dukes, Eric Atwell, Abdul-Baquee M. Sharaf
Added: 29 Oct 2010
Updated: 29 Oct 2010
Type: Conference
Year: 2010
Where: LREC
Authors: Kais Dukes, Eric Atwell, Abdul-Baquee M. Sharaf