This paper addresses the problem of mining named entity translations from comparable corpora, specifically, mining English and Chinese named entity translation. We first observe...
Jinhan Kim, Long Jiang, Seung-won Hwang, Young-In ...
e by placing terms in an abstract ‘information space’ based on their occurrences in text corpora, and then allowing a user to visualize local regions of this information space....
We propose a new unsupervised learning technique for extracting information about authors and topics from large text collections. We model documents as if they were generated by a...
Michal Rosen-Zvi, Chaitanya Chemudugunta, Thomas L...
We apply pattern-based methods for collecting hypernym relations from the web. We compare our approach with hypernym extraction from morphological clues and from large text corpor...
The phonemic statistics were collected from several large Polish corpora. The paper presents the methodology of the acquisition process, a summary of the data, and some phenomena in...