Language model (LM) adaptation is often achieved by combining a generic LM with a topic-specific model that is more relevant to the target document. Unlike previous work on unsup...
The alignment of text line images with the text transcript is a crucial step in handwritten document annotation. Handwritten text alignment is prone to errors due to the difficulty of...
In this paper, we present a new method for on-line Chinese character recognition that relies on an explicit description of character structure. Contrary to most known structur...
This paper describes a system for handwritten Chinese text recognition integrating a language model. On a text line image, the system generates character segmentation and word segme...
This paper presents the Topic-Aspect Model (TAM), a Bayesian mixture model which jointly discovers topics and aspects. We broadly define an aspect of a document as a characteristi...