Sciweavers

» Corpus studies in word prediction
CICLING
2005
Springer
15 years 5 months ago
Instance Pruning by Filtering Uninformative Words: An Information Extraction Case Study
In this paper we present a novel instance pruning technique for Information Extraction (IE). In particular, our technique filters out uninformative words from texts on the basis o...
Alfio Massimiliano Gliozzo, Claudio Giuliano, Raff...
ACL
2012
13 years 2 months ago
Syntactic Annotations for the Google Books NGram Corpus
We present a new edition of the Google Books Ngram Corpus, which describes how often words and phrases were used over a period of five centuries, in eight languages; it reflects...
Yuri Lin, Jean-Baptiste Michel, Erez Aiden Lieberm...
ACL
1994
15 years 1 months ago
Similarity-Based Estimation of Word Cooccurrence Probabilities
In many applications of natural language processing it is necessary to determine the likelihood of a given word combination. For example, a speech recognizer may need to determine...
Ido Dagan, Fernando C. N. Pereira, Lillian Lee
COLING
2000
15 years 1 months ago
Local context templates for Chinese constituent boundary prediction
In this paper, we proposed a shallow syntactic knowledge description: constituent boundary representation and its simple and efficient prediction algorithm, based on different lo...
Qiang Zhou
EMNLP
2010
14 years 9 months ago
An Efficient Algorithm for Unsupervised Word Segmentation with Branching Entropy and MDL
This paper proposes a fast and simple unsupervised word segmentation algorithm that utilizes the local predictability of adjacent character sequences, while searching for a leaste...
Valentin Zhikov, Hiroya Takamura, Manabu Okumura