IAJIT
2011

Improving the accuracy of English-Arabic statistical sentence alignment

Download www.ccis2k.org

Abstract: Multilingual natural language processing systems increasingly rely on parallel corpora to improve their output. Parallel corpora are the basic building block for training a statistical natural language processing system and for creating translation and language models. Several systems have been devised that automatically align the words of a pair of sentences, each in a different language. Such systems have been used successfully with European languages. In this paper, one such system is used to align sentences in an English-Arabic corpus. The system performs poorly when given raw, unaligned English-Arabic sentence pairs. This prompted the development of a preprocessing step that is applied to the Arabic sentences. The same corpus was then preprocessed, and a significant improvement is reported when alignment is attempted on the preprocessed sentences.
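
The abstract does not say what the Arabic preprocessing step consists of. As a purely illustrative sketch, and not the authors' method, the Python below shows the kind of normalization often applied to Arabic text before statistical alignment: stripping diacritics, unifying alef variants, and detaching the definite article so that an Arabic token maps more directly onto English words. The function names `normalize`, `split_article`, and `preprocess`, the `+` marker, and the `min_stem` threshold are all assumptions made for this example.

```python
import re

# Arabic short-vowel marks (U+064B..U+0652) plus tatweel (U+0640).
_DIACRITICS = re.compile("[\u064B-\u0652\u0640]")
# Alef with madda (U+0622), hamza above (U+0623), hamza below (U+0625).
_ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")

ARTICLE = "\u0627\u0644"  # the definite article "al" (alef + lam)


def normalize(text):
    """Remove diacritics and tatweel, then map alef variants to bare alef."""
    text = _DIACRITICS.sub("", text)
    return _ALEF_VARIANTS.sub("\u0627", text)


def split_article(token, min_stem=2):
    """Hypothetical rule: detach a leading article if enough letters remain.

    This is a naive string test, not morphological analysis, so it will
    wrongly split words that merely begin with alef + lam.
    """
    if token.startswith(ARTICLE) and len(token) - len(ARTICLE) >= min_stem:
        return [ARTICLE + "+", token[len(ARTICLE):]]
    return [token]


def preprocess(sentence):
    """Normalize a sentence and return its whitespace tokens, article split off."""
    tokens = []
    for tok in normalize(sentence).split():
        tokens.extend(split_article(tok))
    return tokens
```

Under these assumptions, `preprocess` would be run over every Arabic sentence in the corpus before it is handed to the aligner, while the English side is left unchanged. A real pipeline would more likely rely on a morphological analyzer than on prefix matching.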

Mohammad Salameh, Rached Zantout, Nashat Mansour

Distributed And Parallel Computing | IAJIT 2011 | Natural Language Processing | Statistical Natural Language Processing | Unaligned Sentences

Related Content

» Improved Sentence Alignment on Parallel Web Pages Using a Stochastic Tree Alignment Model

» Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment

» Improved Word Alignment with Statistics and Linguistic Heuristics

» Reliable Measures for Aligning Japanese-English News Articles and Sentences

» Unsupervised Learning of Arabic Stemming Using a Parallel Corpus

» Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora

» A Probability Model to Improve Word Alignment

» Improved Unsupervised Sentence Alignment for Symmetrical and Asymmetrical Parallel Corpora

» Bilingual Text Matching using Bilingual Dictionary and Statistics

Post Info

Added	14 May 2011
Updated	14 May 2011
Type	Journal
Year	2011
Where	IAJIT
Authors	Mohammad Salameh, Rached Zantout, Nashat Mansour