Search Sciweavers | Sciweavers

ACL 2012 | Computational Linguistics

Using Rejuvenation to Improve Particle Filtering for Bayesian Word Segmentation

11 years 7 months ago

We present a novel extension to a recently proposed incremental learning algorithm for the word segmentation problem originally introduced in Goldwater (2006). By adding rejuvenat...

Benjamin Börschinger, Mark Johnson

ACL 2012 | Computational Linguistics

Word Epoch Disambiguation: Finding How Words Change Over Time

11 years 7 months ago

In this paper we introduce the novel task of “word epoch disambiguation,” defined as the problem of identifying changes in word usage over time. Through experiments run using...

Rada Mihalcea, Vivi Nastase

ACL 2012 | Computational Linguistics

Tokenization: Returning to a Long Solved Problem - A Survey, Contrastive Experiment, Recommendations, and Toolkit -

11 years 7 months ago

We examine some of the frequently disregarded subtleties of tokenization in Penn Treebank style, and present a new rule-based preprocessing toolkit that not only reproduces the Tr...

Rebecca Dridan, Stephan Oepen

ACL 2012 | Computational Linguistics

MIX Is Not a Tree-Adjoining Language

11 years 7 months ago

The language MIX consists of all strings over the three-letter alphabet {a, b, c} that contain an equal number of occurrences of each letter. We prove Joshi’s (1985) conjecture ...

Makoto Kanazawa, Sylvain Salvati

ACL 2012 | Computational Linguistics

Movie-DiC: a Movie Dialogue Corpus for Research and Development

11 years 7 months ago

This paper describes Movie-DiC, a Movie Dialogue Corpus recently collected for research and development purposes. The collected dataset comprises 132,229 dialogues containing a tot...

Rafael E. Banchs
