We introduce factored language models (FLMs) and generalized parallel backoff (GPB). An FLM represents words as bundles of features (e.g., morphological classes, stems, data-drive...
Morphologically rich languages pose a challenge to the annotators of treebanks with respect to the status of orthographic (space-delimited) words in the syntactic parse trees. In s...
A grammatical method of combining two kinds of speech repair cues is presented. One cue, prosodic disjuncture, is detected by a decision tree-based ensemble classifier that uses a...
John Hale, Izhak Shafran, Lisa Yung, Bonnie J. Dor...
We present an annotation tool for extended textual coreference and bridging anaphora in the Prague Dependency Treebank 2.0 (PDT 2.0). After we very briefly describe the an...