Sciweavers

ACL
2006
Scaling Distributional Similarity to Large Corpora
Accurately representing synonymy using distributional similarity requires large volumes of data to reliably represent infrequent words. However, the na...
James Gorman, James R. Curran
LREC
2008
Using Similarity Measures to Extend the LinGO Lexicon
Deep processing of natural language requires large scale lexical resources that have sufficient coverage at a sufficient level of detail and accuracy (i.e. both recall and precisi...
Lynne J. Cahill
ACL
2008
Distributed Word Clustering for Large Scale Class-Based Language Modeling in Machine Translation
In statistical language modeling, one technique to reduce the problematic effects of data sparsity is to partition the vocabulary into equivalence classes. In this paper we invest...
Jakob Uszkoreit, Thorsten Brants
TON
2010
Investigating Self-Similarity and Heavy-Tailed Distributions on a Large-Scale Experimental Facility
After the seminal work by Taqqu et al. relating self-similarity to heavy-tailed distributions, a number of research articles verified that aggregated Internet traffic time...
Patrick Loiseau, Paulo Gonçalves, Guillaume...
COLING
2010
Large Scale Parallel Document Mining for Machine Translation
A distributed system is described that reliably mines parallel text from large corpora. The approach can be regarded as cross-language near-duplicate detection, enabled by an init...
Jakob Uszkoreit, Jay Ponte, Ashok C. Popat, Moshe ...