Sciweavers

ACL
2009
Part of Speech Tagger for Assamese Text
Assamese is a morphologically rich, agglutinative and relatively free word order Indic language. Although spoken by nearly 30 million people, very little computational linguistic ...
Navanath Saharia, Dhrubajyoti Das, Utpal Sharma, J...
LREC
2010
Adapting Chinese Word Segmentation for Machine Translation Based on Short Units
In Chinese texts, words composed of single or multiple characters are not separated by spaces, unlike most western languages. Therefore Chinese word segmentation is considered an ...
Yiou Wang, Kiyotaka Uchimoto, Jun'ichi Kazama, Can...
TSD
2007
Springer
Accurate Unlexicalized Parsing for Modern Hebrew
Many state-of-the-art statistical parsers for English can be viewed as Probabilistic Context-Free Grammars (PCFGs) acquired from treebanks consisting of phrase-structure trees enri...
Reut Tsarfaty, Khalil Sima'an
NAACL
2003
TIPS: A Translingual Information Processing System
Searching online information is increasingly a daily activity for many people. The multilinguality of online content is also increasing (e.g. the proportion of English web users, ...
Yaser Al-Onaizan, Radu Florian, Martin Franz, Hany...
ACL
2003
Accurate Unlexicalized Parsing
We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down fa...
Dan Klein, Christopher D. Manning