Abstract. We investigate a generative latent variable model for modelbased word saliency estimation for text modelling and classification. The estimation algorithm derived is able ...
In this paper we look at the problem of cleansing noisy text using a statistical machine translation model. Noisy text is produced in informal communications such as Short Message...
Danish Contractor, Tanveer A. Faruquie, L. Venkata...
The alignment of text line images with text transcript is a crucial step of handwritten document annotation. Handwritten text alignment is prone to errors due to the difficulty of...
This paper describes a technique for automatic recognition of off-line printed Arabic text using Hidden Markov Models. In this work different sizes of overlapping and non-overlapp...
Husni A. Al-Muhtaseb, Sabri A. Mahmoud, Rami Qahwa...
Topic modeling techniques have widespread use in text data mining applications. Some applications use batch models, which perform clustering on the document collection in aggregat...