Sciweavers

AAAI
1990

Parsing a Natural Language Using Mutual Information Statistics

The purpose of this paper is to characterize a constituent boundary parsing algorithm that uses an information-theoretic measure called generalized mutual information and serves as an alternative to traditional grammar-based parsing methods. The method is based on the hypothesis that constituent boundaries can be extracted from a given sentence (or word sequence) by analyzing the mutual information values of the part-of-speech n-grams within the sentence. This hypothesis is supported by the performance of an implementation of the parsing algorithm, which determines a recursive unlabeled bracketing of unrestricted English text with a relatively low error rate. This paper derives the generalized mutual information statistic, describes the parsing algorithm, and presents results and sample output from the parser.
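The hypothesis in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not define generalized mutual information, so the code substitutes ordinary pointwise mutual information over adjacent part-of-speech tag pairs, and the greedy "split at the weakest link" recursion is an assumption made for illustration. The names `bigram_pmi` and `bracket` are hypothetical.

```python
import math
from collections import Counter


def bigram_pmi(tag_sequences):
    """Return a function scoring adjacent POS-tag pairs by pointwise MI.

    Assumption: plain bigram PMI stands in for the paper's generalized
    mutual information, which the abstract does not define.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tags in tag_sequences:
        unigrams.update(tags)
        bigrams.update(zip(tags, tags[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    def pmi(x, y):
        if bigrams[(x, y)] == 0:
            # Unseen pair: treat it as the weakest possible link.
            return float("-inf")
        p_xy = bigrams[(x, y)] / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        return math.log2(p_xy / (p_x * p_y))

    return pmi


def bracket(tags, pmi):
    """Recursively split a tag sequence at its lowest-PMI adjacent pair.

    Returns a nested list, i.e. an unlabeled binary bracketing.
    Assumption: a greedy binary split; the paper's procedure may differ.
    """
    if len(tags) == 1:
        return tags[0]
    scores = [pmi(a, b) for a, b in zip(tags, tags[1:])]
    cut = scores.index(min(scores)) + 1  # boundary falls after the weakest link
    return [bracket(tags[:cut], pmi), bracket(tags[cut:], pmi)]
```

Given a corpus of tagged sentences, `pmi = bigram_pmi(corpus)` estimates the scores and `bracket(sentence_tags, pmi)` returns the nested bracketing. Low mutual information between two adjacent tags is read as evidence of a constituent boundary between them, which is the intuition the abstract states.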
David M. Magerman, Mitchell P. Marcus
Added 06 Nov 2010
Updated 06 Nov 2010
Type Conference
Year 1990
Where AAAI