Sciweavers

ACL
2012
13 years 8 months ago
Tokenization: Returning to a Long Solved Problem - A Survey, Contrastive Experiment, Recommendations, and Toolkit -
We examine some of the frequently disregarded subtleties of tokenization in Penn Treebank style, and present a new rule-based preprocessing toolkit that not only reproduces the Tr...
Rebecca Dridan, Stephan Oepen
ACL
2012
13 years 8 months ago
Learning Better Rule Extraction with Translation Span Alignment
This paper presents an unsupervised approach to learning translation span alignments from parallel data that improves syntactic rule extraction by deleting spurious word alignment...
Jingbo Zhu, Tong Xiao, Chunliang Zhang
ACL
2012
13 years 8 months ago
MIX Is Not a Tree-Adjoining Language
The language MIX consists of all strings over the three-letter alphabet {a, b, c} that contain an equal number of occurrences of each letter. We prove Joshi’s (1985) conjecture ...
Makoto Kanazawa, Sylvain Salvati
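
The definition of MIX quoted in the abstract above can be illustrated with a small membership check. This is only a sketch of the definition as stated (equal counts of a, b, and c); the function name `in_mix` is a hypothetical helper, not something from the paper, and the check says nothing about the tree-adjoining result itself.

```python
from collections import Counter


def in_mix(s: str) -> bool:
    """Return True if s belongs to MIX as defined in the abstract.

    A string is in MIX when it uses only the letters a, b, c and
    contains the same number of occurrences of each letter.
    """
    # Reject any string containing a symbol outside the alphabet {a, b, c}.
    if set(s) - set("abc"):
        return False
    counts = Counter(s)
    # Counter returns 0 for missing letters, so the empty string qualifies.
    return counts["a"] == counts["b"] == counts["c"]
```

For instance, `in_mix("cabbac")` holds because each letter occurs twice in any order, while `in_mix("aab")` does not.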
ACL
2012
13 years 8 months ago
Word Epoch Disambiguation: Finding How Words Change Over Time
In this paper we introduce the novel task of “word epoch disambiguation,” defined as the problem of identifying changes in word usage over time. Through experiments run using...
Rada Mihalcea, Vivi Nastase
ACL
2012
13 years 8 months ago
Spectral Learning of Latent-Variable PCFGs
We introduce a spectral learning algorithm for latent-variable PCFGs (Petrov et al., 2006). Under a separability (singular value) condition, we prove that the method provides cons...
Shay B. Cohen, Karl Stratos, Michael Collins, Dean...