How Verb Subcategorization Frequencies Are Affected By Corpus Choice

The probabilistic relation between verbs and their arguments plays an important role in modern statistical parsers and supertaggers, and in psychological theories of language processing. But these probabilities are computed in very different ways by the two sets of researchers. Computational linguists compute verb subcategorization probabilities from large corpora while psycholinguists compute them from psychological studies (sentence production and completion tasks). Recent studies have found differences between corpus frequencies and psycholinguistic measures. We analyze subcategorization frequencies from four different corpora: psychological sentence production data (Connine et al. 1984), written text (Brown and WSJ), and telephone conversation data (Switchboard). We find two different sources for the differences. Discourse influence is a result of how verb use is affected by different discourse types such as narrative, connected discourse, and single sentence productions. Semantic...
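As a rough illustration of the corpus-based side of this comparison, the sketch below estimates P(frame | verb) by relative frequency from (verb, frame) observations and does so separately for two sources. It is a minimal sketch, not the authors' procedure: the function name `subcat_probabilities`, the frame labels, and the toy observations are all hypothetical and are not taken from the paper or from any of the corpora named above.

```python
from collections import Counter, defaultdict

def subcat_probabilities(observations):
    """Estimate P(frame | verb) by relative frequency from (verb, frame) pairs."""
    counts = defaultdict(Counter)
    for verb, frame in observations:
        counts[verb][frame] += 1
    return {
        verb: {frame: n / sum(frames.values()) for frame, n in frames.items()}
        for verb, frames in counts.items()
    }

# Hypothetical toy observations (NOT data from the paper): the same verb
# observed with different frames in two different sources.
written = [("charge", "NP"), ("charge", "NP"), ("charge", "NP PP"), ("charge", "PP")]
spoken = [("charge", "NP"), ("charge", "PP"), ("charge", "PP"), ("charge", "PP")]

p_written = subcat_probabilities(written)
p_spoken = subcat_probabilities(spoken)

# Compare the estimated frame distribution of each verb across the two sources.
for verb in p_written:
    frames = sorted(set(p_written[verb]) | set(p_spoken.get(verb, {})))
    for frame in frames:
        w = p_written[verb].get(frame, 0.0)
        s = p_spoken.get(verb, {}).get(frame, 0.0)
        print(f"{verb:8s} {frame:6s} written={w:.2f} spoken={s:.2f}")
```

With estimates computed per source in this way, the same verb can end up with noticeably different frame distributions, which is the kind of cross-corpus difference the abstract sets out to explain.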
Douglas Roland, Daniel Jurafsky
Type Conference
Year 1998
Where ACL