In this paper we study the problem of finding most topical named entities among all entities in a document, which we refer to as focused named entity recognition. We show that th...
In this paper, we study the problem of using an annotated corpus in English for the same natural language processing task in another language. While various machine translation sy...
We address the issue of using heterogeneous treebanks for parsing by breaking it down into two sub-problems, converting grammar formalisms of the treebanks to the same one, and pa...
An important and well-studied problem is the production of semantic lexicons from a large corpus. In this paper, we present a system named ASIA (Automatic Set Instance Acquirer), ...
We employ Maximum Entropy model to conduct sub-tree alignment between bilingual phrasal structure trees. Various lexical and structural knowledge is explored to measure the syntac...