The omnipresence of unknown words is a problem that any NLP component needs to address in some form. While there exist many established techniques for dealing with unknown words i...
This paper presents an approach to text categorization that i) uses no machine learning and ii) reacts on-the-fly to unknown words. These features are important for categorizing B...
Words unknown to the lexicon present a substantial problem to part-of-speech tagging. In this paper we present a technique for fully unsupervised statistical acquisition of rules ...
Statistical machine learning methods are employed to train a Named Entity Recognizer from annotated data. Methods like Maximum Entropy and Conditional Random Fields make use of fe...
It is widely known that the quality of confidence measure is critical for speech applications. In this paper, we present our recent work on improving word confidence scores by cal...