Splitting Long or Ill-formed Input for Robust Spoken-language Translation

This paper proposes an input-splitting method for translating spoken language, which includes many long or ill-formed expressions. The proposed method splits input into well-balanced translation units based on a semantic distance calculation. The splitting is performed during left-to-right parsing and does not degrade translation efficiency. The complete translation result is formed by concatenating the partial translation results of the split units. The proposed method can be incorporated into frameworks such as TDMT, which use left-to-right parsing and a score for a substructure. Experimental results show that the proposed method gives TDMT the following advantages: (1) elimination of null outputs, (2) splitting of utterances into sentences, and (3) robust translation of erroneous speech recognition results.
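The split-then-concatenate idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the paper performs splitting during left-to-right parsing inside TDMT, whereas the sketch below swaps in a plain dynamic program over split points. The names `split_input`, `translate`, `toy_scorer`, `PHRASES`, and `split_penalty` are all hypothetical, and the distance values are made up for illustration.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical scorer: returns a semantic distance for a span (lower is
# better), or None when the span cannot be analyzed as a single unit.
SpanScorer = Callable[[Sequence[str]], Optional[float]]


def split_input(tokens: Sequence[str], score_span: SpanScorer,
                split_penalty: float = 1.0) -> List[List[str]]:
    """Pick the segmentation with the lowest total distance.

    Assumption: each extra unit costs `split_penalty`, so the search
    prefers fewer, larger units when they can be analyzed.
    """
    n = len(tokens)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: lowest cost covering tokens[:i]
    back = [-1] * (n + 1)    # back[i]: start index of the last unit
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(end):
            if best[start] == inf:
                continue
            distance = score_span(tokens[start:end])
            if distance is None:
                continue
            cost = best[start] + distance + (split_penalty if start else 0.0)
            if cost < best[end]:
                best[end], back[end] = cost, start
    if best[n] == inf:
        raise ValueError("no segmentation covers the whole input")
    units: List[List[str]] = []
    end = n
    while end > 0:           # walk the back-pointers from the end
        units.append(list(tokens[back[end]:end]))
        end = back[end]
    units.reverse()
    return units


def translate(tokens: Sequence[str], score_span: SpanScorer,
              translate_unit: Callable[[List[str]], str]) -> str:
    """Translate each unit separately and concatenate the partial results."""
    return " ".join(translate_unit(u) for u in split_input(tokens, score_span))


# Toy data (made up): two known phrases; single words are a costly fallback.
PHRASES = {("thank", "you"): 0.1, ("see", "you", "tomorrow"): 0.2}


def toy_scorer(span: Sequence[str]) -> Optional[float]:
    key = tuple(span)
    if key in PHRASES:
        return PHRASES[key]
    return 1.0 if len(key) == 1 else None


utterance = "thank you see you tomorrow".split()
units = split_input(utterance, toy_scorer)
```

Here the whole utterance cannot be scored as one unit, so an unsplit system would produce no output. The sketch instead returns two units, `["thank", "you"]` and `["see", "you", "tomorrow"]`, each of which could be translated on its own.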

Osamu Furuse, Setsuo Yamada, Kazuhide Yamamoto

ACL 1998 | ACL 2007 | Input-splitting Method | Left-to-right Parsing | Well-balanced Translation Units

Added	01 Nov 2010
Updated	01 Nov 2010
Type	Conference
Year	1998
Where	ACL
Authors	Osamu Furuse, Setsuo Yamada, Kazuhide Yamamoto
