Using Linguistic Principles to Recover Empty Categories

9 years 7 months ago
Using Linguistic Principles to Recover Empty Categories
This paper describes an algorithm for detecting empty nodes in the Penn Treebank (Marcus et al., 1993), finding their antecedents, and assigning them function tags, without access to lexical information such as valency. Unlike previous approaches to this task, the current method is not corpus-based, but rather makes use of the principles of early Government-Binding theory (Chomsky, 1981), the syntactic theory that underlies the annotation. Using the evaluation metric proposed by Johnson (2002), this approach outperforms previously published approaches on both detection of empty categories and antecedent identification, given either annotated input stripped of empty categories or the output of a parser. Some problems with this evaluation metric are noted and an alternative is proposed along with the results. The paper considers the reasons a principlebased approach to this problem should outperform corpus-based approaches, and speculates on the possibility of a hybrid approach.
Richard Campbell
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2004
Where ACL
Authors Richard Campbell
Comments (0)