Sciweavers

» Corpora and data preparation
WWW
2007
ACM
15 years 10 months ago
Towards domain-independent information extraction from web tables
Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of <table> tags. A mul...
Bernhard Krüpl, Bernhard Pollak, Marcus Herzo...
WWW
2003
ACM
15 years 10 months ago
SemTag and Seeker: bootstrapping the semantic web via automated semantic annotation
This paper describes Seeker, a platform for large-scale text analytics, and SemTag, an application written on the platform to perform automated semantic tagging of large corpora. ...
Stephen Dill, Nadav Eiron, David Gibson, Daniel Gr...
CICLING
2010
Springer
15 years 4 months ago
Towards Automatic Detection and Tracking of Topic Change
We present an approach for automatic detection of topic change. Our approach is based on the analysis of statistical features of topics in time-sliced corpora and their dynamics ov...
Florian Holz, Sven Teresniak
ICASSP
2009
IEEE
15 years 4 months ago
Comparing maximum a posteriori vector quantization and Gaussian mixture models in speaker verification
The Gaussian mixture model-universal background model (GMM-UBM) is a standard reference classifier in speaker verification. We have recently proposed a simplified model using vect...
Tomi Kinnunen, Juhani Saastamoinen, Ville Hautamä...
ICASSP
2008
IEEE
15 years 4 months ago
Learning with noisy supervision for Spoken Language Understanding
Data-driven Spoken Language Understanding (SLU) systems need semantically annotated data, which are expensive and time-consuming to produce and prone to human error. Active learning has been su...
Christian Raymond, G. Riccardi