
LREC
2010

Improving Chunking Accuracy on Croatian Texts by Morphosyntactic Tagging

In this paper, we present the results of an experiment in which a stochastic morphosyntactic tagger serves as a pre-processing module for a rule-based chunker and partial parser for Croatian, with the aim of raising its overall chunking and partial parsing accuracy on Croatian texts. To conduct the experiment, we manually chunked and partially parsed 459 sentences from the Croatia Weekly 100 kw newspaper sub-corpus of the Croatian National Corpus; these sentences had previously been morphosyntactically disambiguated and lemmatized. Due to the lack of resources of this type, these sentences were designated as a temporary chunking and partial parsing gold standard for Croatian. We then evaluated the chunker and partial parser in three scenarios: (1) chunking text with no prior morphosyntactic tagging, (2) chunking text tagged by the stochastic morphosyntactic tagger for Croatian, and (3) chunking manually tagged text. The obtained F1-scores for the three s...
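As an illustration of how a chunker's output can be scored against a gold standard such as the one described in the abstract, the following minimal Python sketch computes precision, recall, and F1 over exact-match chunk spans. The span representation `(sentence_id, start, end, label)`, the function name `chunk_f1`, and the example labels are assumptions made for this sketch; the abstract does not specify the evaluation procedure the authors used.

```python
from typing import Iterable, Tuple

# Hypothetical chunk representation, not taken from the paper:
# (sentence_id, start_token, end_token_exclusive, label)
Chunk = Tuple[int, int, int, str]


def chunk_f1(gold: Iterable[Chunk], predicted: Iterable[Chunk]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact span-and-label matching."""
    gold_set = set(gold)
    pred_set = set(predicted)
    correct = len(gold_set & pred_set)  # chunks identical in span and label

    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative data only: one sentence, three gold chunks, four predicted chunks.
gold_chunks = [(0, 0, 2, "NP"), (0, 2, 3, "VP"), (0, 3, 5, "PP")]
predicted_chunks = [(0, 0, 2, "NP"), (0, 2, 3, "VP"), (0, 3, 4, "NP"), (0, 4, 5, "NP")]

precision, recall, f1 = chunk_f1(gold_chunks, predicted_chunks)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Under this scheme, comparing the three scenarios would amount to running the same chunker on untagged, automatically tagged, and manually tagged input and calling `chunk_f1` once per output against the same gold chunks.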
Kristina Vuckovic, Zeljko Agic, Marko Tadic
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where LREC
Authors Kristina Vuckovic, Zeljko Agic, Marko Tadic