Abstract. We present a linguistically-motivated sub-sentential alignment system that extends the intersected IBM Model 4 word alignments. The alignment system is chunk-driven and r...
This paper proposes a method for extracting bilingual text pairs from a comparable corpus. The basic idea of the method is to apply bootstrapping to an existing corpusbased cross-...
Hiroshi Masuichi, Raymond Flournoy, Stefan Kaufman...
A stakeholder is an individual, group, organization, or community that has an interest or stake in a consensus-building process. The goal of stakeholder identification is identify...
We present a simple algorithm for clustering semantic patterns based on distributional similarity and use cluster memberships to guide semi-supervised pattern discovery. We apply ...
Abstract. Geographic named entities can be classified into many subtypes that are useful for applications such as information extraction and question answering. In this paper, we ...