Sciweavers

ACL
2012
11 years 7 months ago
Tokenization: Returning to a Long Solved Problem - A Survey, Contrastive Experiment, Recommendations, and Toolkit -
We examine some of the frequently disregarded subtleties of tokenization in Penn Treebank style, and present a new rule-based preprocessing toolkit that not only reproduces the Tr...
Rebecca Dridan, Stephan Oepen
ACL
2012
11 years 7 months ago
Head-driven Transition-based Parsing with Top-down Prediction
This paper presents a novel top-down head-driven parsing algorithm for data-driven projective dependency analysis. This algorithm handles global structures, such as clause and coor...
Katsuhiko Hayashi, Taro Watanabe, Masayuki Asahara...
ACL
2011
12 years 8 months ago
Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation
Resolving coordination ambiguity is a classic hard problem. This paper looks at coordination disambiguation in complex noun phrases (NPs). Parsers trained on the Penn Treebank are...
Shane Bergsma, David Yarowsky, Kenneth Ward Church
EMNLP
2009
13 years 2 months ago
Improving Dependency Parsing with Subtrees from Auto-Parsed Data
This paper presents a simple and effective approach to improve dependency parsing by using subtrees from auto-parsed data. First, we use a baseline parser to parse large-scale una...
Wenliang Chen, Jun'ichi Kazama, Kiyotaka Uchimoto,...
EMNLP
2010
13 years 2 months ago
Utilizing Extra-Sentential Context for Parsing
Syntactic consistency is the preference to reuse a syntactic construction shortly after its appearance in a discourse. We present an analysis of the WSJ portion of the Penn Treeba...
Jackie Chi Kit Cheung, Gerald Penn
ACL
2010
13 years 2 months ago
Efficient Third-Order Dependency Parsers
We present algorithms for higher-order dependency parsing that are "third-order" in the sense that they can evaluate substructures containing three dependencies, and "...
Terry Koo, Michael Collins
EMNLP
2006
13 years 6 months ago
Learning Phrasal Categories
In this work we learn clusters of contextual annotations for non-terminals in the Penn Treebank. Perhaps the best way to think about this problem is to contrast our work with that...
William P. Headden III, Eugene Charniak, Mark John...
ACL
2006
13 years 6 months ago
Trace Prediction and Recovery with Unlexicalized PCFGs and Slash Features
This paper describes a parser which generates parse trees with empty elements in which traces and fillers are co-indexed. The parser is an unlexicalized PCFG parser which is guara...
Helmut Schmid
ACL
2004
13 years 6 months ago
Using Linguistic Principles to Recover Empty Categories
This paper describes an algorithm for detecting empty nodes in the Penn Treebank (Marcus et al., 1993), finding their antecedents, and assigning them function tags, without access...
Richard Campbell
NAACL
2007
13 years 6 months ago
Language Modeling for Determiner Selection
We present a method for automatic determiner selection, based on an existing language model. We train on the Penn Treebank and also use additional data from the North American New...
Jenine Turner, Eugene Charniak