Sciweavers

» Corpora and data preparation
SMC
2007
IEEE
15 years 4 months ago
ADtrees for sequential data and n-gram Counting
We consider the problem of efficiently storing n-gram counts for large n over very large corpora. In such cases, the efficient storage of sufficient statistics can ha...
Robert Van Dam, Dan Ventura
IJCAI
2001
14 years 11 months ago
Mining Soft-Matching Rules from Textual Data
Text mining concerns the discovery of knowledge from unstructured textual data. One important task is the discovery of rules that relate specific words and phrases. Although exist...
Un Yong Nahm, Raymond J. Mooney
IJDMB
2008
14 years 10 months ago
Temporal representation for gene networks: towards a qualitative temporal data mining
Recently lots of studies aim at modeling and inferring gene networks. Modeling tools propose graphical models having almost nothing about time description of events and regards t...
Nicolas Turenne, Sylviane R. Schwer
NAACL
2010
14 years 8 months ago
Using Mostly Native Data to Correct Errors in Learners' Writing
We present results from a range of experiments on article and preposition error correction for non-native speakers of English. We first compare a language model and error-specific ...
Michael Gamon
CICLING
2004
Springer
15 years 1 months ago
Language-Independent Methods for Compiling Monolingual Lexical Data
In this paper we describe a flexible, portable and language-independent infrastructure for setting up large monolingual language corpora. The approach is based on collecti...
Christian Biemann, Stefan Bordag, Gerhard Heyer, U...