Sciweavers

COLING
2008

A Linguistic Knowledge Discovery Tool: Very Large Ngram Database Search with Arbitrary Wildcards

This paper describes a search tool for a very large set of ngrams. The tool supports queries with an arbitrary number of wildcards. A search takes a fraction of a second and can return the fillers of the wildcards. The system runs on a single Linux PC with a modest amount of memory (less than 4GB) and disk space (less than 400GB). It can be a useful tool for linguistic knowledge discovery and other NLP tasks.
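
To make the query model concrete, here is a minimal sketch of a wildcard ngram search that returns the wildcard fillers. It is not the paper's implementation: it scans a small in-memory dictionary linearly, which would not scale to the data sizes the abstract mentions. The `search` function, the `*` wildcard symbol, the assumption that each wildcard matches exactly one token, and the toy counts are all hypothetical and chosen only for illustration.

```python
from collections import Counter

WILDCARD = "*"  # assumed wildcard symbol; each one matches exactly one token


def search(ngrams, query):
    """Return a Counter mapping wildcard fillers to their summed frequency.

    ngrams: dict mapping a tuple of tokens to a frequency count.
    query:  whitespace-separated pattern, e.g. "such as *".
    """
    pattern = query.split()
    # Positions in the pattern that are wildcards.
    slots = [i for i, tok in enumerate(pattern) if tok == WILDCARD]
    fillers = Counter()
    for ngram, freq in ngrams.items():
        if len(ngram) != len(pattern):
            continue  # only ngrams of the same length can match
        if all(p == WILDCARD or p == t for p, t in zip(pattern, ngram)):
            # Record the tokens that filled the wildcard positions.
            fillers[tuple(ngram[i] for i in slots)] += freq
    return fillers


# Toy data, made up for illustration only.
TOY_NGRAMS = {
    ("such", "as", "apples"): 12,
    ("such", "as", "pears"): 7,
    ("known", "as", "apples"): 3,
    ("such", "as", "red", "apples"): 2,
}

result = search(TOY_NGRAMS, "such as *")
for filler, freq in result.most_common():
    print(" ".join(filler), freq)
```

A query with several wildcards, such as `"* as *"`, returns one filler tuple per match, with one element per wildcard position. A system at the scale described in the abstract would need an on-disk index instead of a linear scan.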
Satoshi Sekine
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where COLING
Authors Satoshi Sekine