Sciweavers

» Corpora and data preparation
VLDB
2001
ACM
15 years 2 months ago
Navigating large-scale semi-structured data in business portals
This paper presents several paradigms by which users of Verity business portals (from within as well as from outside an enterprise) discover and navigate relevant semistructured d...
Mani Abrol, Neil Latarche, Uma Mahadevan, Jianchan...
DCC
2011
IEEE
14 years 5 months ago
Deplump for Streaming Data
We present a general-purpose, lossless compressor for streaming data. This compressor is based on the deplump probabilistic compressor for batch data. Approximations to the infere...
Nicholas Bartlett, Frank Wood
LREC
2008
14 years 11 months ago
Dialogue, Speech and Images: the Companions Project Data Set
This paper describes part of the corpus collection efforts underway in the EC-funded Companions project. The Companions project is collecting substantial quantities of dialogue a ...
Yorick Wilks, David Benyon, Christopher Brewster, ...
ALIFE
1999
14 years 9 months ago
An Approach to Biological Computation: Unicellular Core-Memory Creatures Evolved Using Genetic Algorithms
A novel machine language genetic programming system that uses one-dimensional core memories is proposed and simulated. The core is compared to a biochemical reaction space, and in ...
Hideaki Suzuki
LREC
2008
14 years 11 months ago
Cleaneval: a Competition for Cleaning Web Pages
Cleaneval is a shared task and competitive evaluation on the topic of cleaning arbitrary web pages, with the goal of preparing web data for use as a corpus for linguistic and lang...
Marco Baroni, Francis Chantree, Adam Kilgarriff, S...