Sciweavers

ACL
2012
11 years 7 months ago
Using Rejuvenation to Improve Particle Filtering for Bayesian Word Segmentation
We present a novel extension to a recently proposed incremental learning algorithm for the word segmentation problem originally introduced in Goldwater (2006). By adding rejuvenat...
Benjamin Börschinger, Mark Johnson
ACL
2012
11 years 7 months ago
Word Epoch Disambiguation: Finding How Words Change Over Time
In this paper we introduce the novel task of “word epoch disambiguation,” defined as the problem of identifying changes in word usage over time. Through experiments run using...
Rada Mihalcea, Vivi Nastase
ACL
2012
11 years 7 months ago
Tokenization: Returning to a Long Solved Problem - A Survey, Contrastive Experiment, Recommendations, and Toolkit -
We examine some of the frequently disregarded subtleties of tokenization in Penn Treebank style, and present a new rule-based preprocessing toolkit that not only reproduces the Tr...
Rebecca Dridan, Stephan Oepen
ACL
2012
11 years 7 months ago
MIX Is Not a Tree-Adjoining Language
The language MIX consists of all strings over the three-letter alphabet {a, b, c} that contain an equal number of occurrences of each letter. We prove Joshi’s (1985) conjecture ...
Makoto Kanazawa, Sylvain Salvati
ACL
2012
11 years 7 months ago
Movie-DiC: a Movie Dialogue Corpus for Research and Development
This paper describes Movie-DiC, a Movie Dialogue Corpus recently collected for research and development purposes. The collected dataset comprises 132,229 dialogues containing a tot...
Rafael E. Banchs