The vocabulary used in speech usually consists of two types of words: a limited set of common words, shared across multiple documents, and a virtually unlimited set of rare words, ...
Stefan Kombrink, Mirko Hannemann, Lukas Burget, Hy...
This paper presents the first stochastic finite-state morphological parser for Turkish. The non-probabilistic parser is a standard finite-state transducer implementation of two-le...
The technique of trace-based just-in-time compilation was introduced by Bala et al. and was further developed by Gal et al. It currently enjoys success in Mozilla Firefox’s Java...
Unlike traditional database queries, keyword queries do not adhere to predefined syntax and are often dirty with irrelevant words from natural languages. This makes accurate and e...
String-to-string transduction is a central problem in computational linguistics and natural language processing. It occurs in tasks as diverse as name transliteration, spelling co...