We present a nearest-neighbor algorithm for resolving prepositional phrase attachment ambiguities. Its performance is significantly higher than previous corpus-based methods for P...
Multi-document discourse analysis has emerged with the potential of improving various NLP applications. Based on the newly proposed Cross-document Structure Theory (CST), this pap...
We study the problem of efficiently removing equal frequency n-gram substrings from an n-gram set, formally called Statistical Substring Reduction (SSR). SSR is a useful operatio...
In a text categorization task, classification on some hierarchy of classes shows better results than the case without the hierarchy. In current environments where large amount of ...
In this paper, we propose two indepentent solutions to resolve the problems of many different spelling variants and of lack of annotated corpus for training, which are two main di...
In this paper we present a noun phrase coreference resolution system which aims to enhance the identification of the coreference realized by string matching. For this purpose, we ...
Xiaofeng Yang, Guodong Zhou, Jian Su, Chew Lim Tan
The paper discusses two policies for recognizing NEs with complex structures by maximum entropy models. One policy is to develop cascaded MaxEnt models at different levels. The ot...
This paper presents an iterative CKY parsing algorithm for probabilistic contextfree grammars (PCFG). This algorithm enables us to prune unnecessary edges produced during parsing,...
In this paper, a series of window-based methods is proposed for information retrieval. Compared with traditional tf-idf model, our approaches are based on two new key notions. The ...