We present a technique for augmenting annotated training data with hierarchical word clusters that are automatically derived from a large unannotated corpus. Cluster membership is...
Word sense discrimination is an unsupervised clustering problem, which seeks to discover which instances of a word/s are used in the same meaning. This is done strictly based on i...
This work applies boosted wrapper induction (BWI), a machine learning algorithm for information extraction from semi-structured documents, to the problem of named entity recogniti...
This paper presents a recognition system for isolated handwritten Bangla words, with a fixed lexicon, using a left-right Hidden Markov Model (HMM). A stochastic search method, nam...
Named Entity (NE) recognition from the results of Automatic Speech Recognition (ASR) is challenging because of ASR errors. To detect NEs, one of the options is to use a statistica...