Hand-coded scripts were used in the 1970-80s as knowledge backbones that enabled inference and other NLP tasks requiring deep semantic knowledge. We propose unsupervised induction...
This paper presents AnCora, a multilingual corpus annotated at different linguistic levels consisting of 500,000 words in Catalan (AnCora-Ca) and in Spanish (AnCora-Es). At presen...
The probabilistic relation between verbs and their arguments plays an important role in modern statistical parsers and supertaggers, and in psychological theories of language proc...
This paper describes a new program, correct, which takes words rejected by the Unix spell program, proposes a list of candidate corrections, and sorts them by probability. The pro...
Mark D. Kernighan, Kenneth Ward Church, William A....
This article focuses on why academic writers in computer science and sociology sometimes supply the reader with more details of citees' names than they need to: why citers na...