We present a stochastic finite-state model for segmenting Chinese text into dictionary entries and productively derived words, and providing pronunciations for these words; the me...
Richard Sproat, Chilin Shih, William Gale, Nancy C...
In this paper, we present a novel graph-based method for extracting handwritten text lines in monochromatic Arabic document images. Our approach consists of two steps Coarse text ...
Jayant Kumar, Wael Abd-Almageed, Le Kang, David S....
Automatic Term recognition (ATR) is a fundamental processing step preceding more complex tasks such as semantic search and ontology learning. From a large number of methodologies ...
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...
In order to take steps towards establishing a methodology for evaluating Natural Language systems, we conducted a case study. We attempt to evaluate two different approaches to an...