We demonstrate an approach for inducing a tagger for historical languages based on existing resources for their modern varieties. Tags from Present Day English source text are pro...
Abundant Chinese paraphrasing resource on Internet can be attained from different Chinese translations of one foreign masterpiece. Paraphrases corpus is the corpus that includes s...
Parallel web pages are important source of training data for statistical machine translation. In this paper, we present a new approach to sentence alignment on parallel web pages....
We present a new, unique and freely available parallel corpus containing European Union (EU) documents of mostly legal nature. It is available in all 20 official EU languages, wit...
Ralf Steinberger, Bruno Pouliquen, Anna Widiger, C...
Abstract. The multiple sequence alignment problem (MSA) can be reformulated as the problem of finding a maximum weight trace in an alignment graph, which is derived from all pairw...