Sciweavers

EMNLP
2009

Collocation Extraction Using Monolingual Word Alignment Method

13 years 2 months ago
Collocation Extraction Using Monolingual Word Alignment Method
Statistical bilingual word alignment has been well studied in the context of machine translation. This paper adapts the bilingual word alignment algorithm to monolingual scenario to extract collocations from monolingual corpus. The monolingual corpus is first replicated to generate a parallel corpus, where each sentence pair consists of two identical sentences in the same language. Then the monolingual word alignment algorithm is employed to align the potentially collocated words in the monolingual sentences. Finally the aligned word pairs are ranked according to refined alignment probabilities and those with higher scores are extracted as collocations. We conducted experiments using Chinese and English corpora individually. Compared with previous approaches, which use association measures to extract collocations from the co-occurring word pairs within a given window, our method achieves higher precision and recall. According to human evaluation in terms of precision, our method achie...
Zhan-yi Liu, Haifeng Wang, Hua Wu, Sheng Li
Added 17 Feb 2011
Updated 17 Feb 2011
Type Journal
Year 2009
Where EMNLP
Authors Zhan-yi Liu, Haifeng Wang, Hua Wu, Sheng Li
Comments (0)