Abstract. Documents written in languages other than English sometimes include parenthetical English translations, usually for technical and scientific terminology. Techniques had be...
This paper presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...
The work presented in this paper explores a supervised method for learning a probabilistic model of a lexicon of VerbNet classes. We intend for the probabilistic model to provide ...
This paper proposes a method for extracting bilingual text pairs from a comparable corpus. The basic idea of the method is to apply bootstrapping to an existing corpus-based cross-...
Hiroshi Masuichi, Raymond Flournoy, Stefan Kaufman...
We describe methods for automatically identifying signature blocks and reply lines in plaintext email messages. This analysis has many potential applications, such as preprocessi...