Memory-based active learning for French broadcast news


Stochastic dependency parsers can achieve very good results when they are trained on large, manually annotated corpora. Active learning aims to reduce this annotation cost by selecting as few sentences as possible while still producing the best possible parser. We propose a new selective sampling function for active learning that exploits two memory-based distances to find a good compromise between parser uncertainty and sentence representativeness. The reduced dependency between the parsing and selection models opens interesting perspectives for future model combination. The approach is validated on a French broadcast news corpus creation task dedicated to dependency parsing. It outperforms the baseline entropy-based uncertainty selective sampling on this task. We plan to extend this work with self- and co-training methods in order to enlarge this corpus and produce the first French broadcast news treebank.
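As a rough illustration of the entropy-based uncertainty baseline that the abstract compares against (not the memory-based selection function proposed in the paper, whose details are not given here), the sketch below ranks unlabeled sentences by the mean entropy of the parser's per-token head-attachment distributions. The function names, the `pool` layout, and the probability values are assumptions made for this example only.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def sentence_uncertainty(head_distributions):
    """Mean per-token entropy of the parser's head-attachment distributions.

    Assumption: the parser exposes, for each token, a probability
    distribution over candidate heads.
    """
    if not head_distributions:
        return 0.0
    total = sum(entropy(dist) for dist in head_distributions)
    return total / len(head_distributions)

def select_for_annotation(pool, k):
    """Return the ids of the k most uncertain sentences in the pool."""
    ranked = sorted(pool, key=lambda sid: sentence_uncertainty(pool[sid]),
                    reverse=True)
    return ranked[:k]

# Hypothetical pool: sentence id -> one head distribution per token.
pool = {
    "s1": [[0.98, 0.01, 0.01], [0.90, 0.05, 0.05]],  # parser is confident
    "s2": [[0.40, 0.35, 0.25], [0.50, 0.50]],        # parser is unsure
    "s3": [[0.70, 0.20, 0.10], [0.60, 0.40]],
}
print(select_for_annotation(pool, 2))  # → ['s2', 's3']
```

A pure uncertainty ranking like this one ignores how typical each sentence is of the corpus, which is the gap the abstract's compromise between uncertainty and representativeness is meant to address.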

Frédéric Tantini, Christophe Cerisara, Claire Gardent


Active Learning | INTERSPEECH 2010 | Parser | Signal Processing | Stochastic Dependency Parsers |



Post Info

Added	18 May 2011
Updated	18 May 2011
Type	Journal
Year	2010
Where	INTERSPEECH
Authors	Frédéric Tantini, Christophe Cerisara, Claire Gardent
