In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing f...
In this paper, stochastic error-correcting parsing is proposed as a powerful and flexible method to post-process the results of an optical character recognizer (OCR). Determinist...
Recent models of natural language processing employ statistical reasoning for dealing with the ambiguity of formal grammars. In this approach, statistics, concerning the various li...
Information Retrieval (IR) is a major component in many of our daily activities, with perhaps its most prominent role manifested in search engines. Today’s most advanced engines...
Abstract. Since the early days of generation research, it has been acknowledged that modeling the global structure of a document is crucial for producing coherent, readable output....