ACL 2012
Tokenization: Returning to a Long Solved Problem - A Survey, Contrastive Experiment, Recommendations, and Toolkit -
We examine some of the frequently disregarded subtleties of tokenization in Penn Treebank style, and present a new rule-based preprocessing toolkit that not only reproduces the Tr...
Rebecca Dridan, Stephan Oepen