JMLR
2002

Shallow Parsing using Noisy and Non-Stationary Training Material

Shallow parsers are usually assumed to be trained on noise-free material, drawn from the same distribution as the testing material. However, when the training set is either noisy or drawn from a different distribution, performance may be degraded. Using the parsed Wall Street Journal, we investigate the performance of four shallow parsers (maximum entropy, memory-based learning, N-grams and ensemble learning) trained using various types of artificially noisy material. Our first set of results shows that shallow parsers are surprisingly robust to synthetic noise, with performance gradually decreasing as the rate of noise increases. Further results show that no single shallow parser performs best in all noise situations. Final results show that simple, parser-specific extensions can improve noise-tolerance. Our second set of results addresses the question of whether naturally occurring disfluencies undermine performance more than does a change in distribution. Results using the pa...
Miles Osborne
Added 22 Dec 2010
Updated 22 Dec 2010
Type Journal