In this paper, we explore a CLIR-based approach to construct large-scale Chinese-English comparable corpora, which is valuable for translation knowledge mining. The initial source...
We present a sub-sentential alignment system that links linguistically motivated phrases in parallel texts based on lexical correspondences and syntactic similarity. We compare th...
Background: Growing interest in the application of natural language processing methods to biomedical text has led to an increasing number of corpora and methods targeting protein-...
Sampo Pyysalo, Antti Airola, Juho Heimonen, Jari B...
In this paper, we will address term translation extraction from indexed aligned parallel corpora, by using a couple of association measures combined by a voting scheme, for scaling...
We describe a novel technique and implemented system for constructing a subcategorization dictionary from textual corpora. Each dictionary entry encodes the relative frequency of ...