Sciweavers

» Corpora and data preparation
CVPR
2010
IEEE
15 years 6 months ago
Connecting Modalities: Semi-supervised Segmentation and Annotation of Images Using Unaligned Text Corpora
We propose a semi-supervised model which segments and annotates images using very few labeled images and a large unaligned text corpus to relate image regions to text labels. Give...
Richard Socher, Li Fei-Fei
IJCNLP
2005
Springer
15 years 3 months ago
Inversion Transduction Grammar Constraints for Mining Parallel Sentences from Quasi-Comparable Corpora
We present a new implication of Wu’s (1997) Inversion Transduction Grammar (ITG) Hypothesis, on the problem of retrieving truly parallel sentence translations from larg...
Dekai Wu, Pascale Fung
LREC
2010
14 years 11 months ago
BlogBuster: A Tool for Extracting Corpora from the Blogosphere
This paper presents BlogBuster, a tool for extracting a corpus from the blogosphere. The topic of cleaning arbitrary web pages with the goal of extracting a corpus from web data, ...
Georgios Petasis, Dimitrios Petasis
LREC
2010
14 years 11 months ago
Extraction of German Multiword Expressions from Parsed Corpora Using Context Features
We report on tools for the extraction of German multiword expressions (MWEs) from text corpora; we extract word pairs, but also longer MWEs of different patterns, e.g. verb-nou...
Marion Weller, Ulrich Heid
LREC
2010
14 years 11 months ago
Towards Optimal TTS Corpora
Unit selection text-to-speech systems currently produce very natural synthesized phrases by concatenating speech segments from a large database. Recently, increasing demand for de...
Didier Cadic, Cédric Boidin, Christophe d'A...