In this paper, we tackle the problem of localizing graphical symbols on complex technical document images by using an original approach to solve the subgraph isomorphism problem. ...
Machine recognition of hand-filled forms is a challenging task. Form processing involves many activities including form field location, field frame boundary removal and data image...
We investigate the novel problem of event recognition from news webpages. "Events" are basic text units containing news elements. We observe that a news article is always...
The paper proposes an approach to modeling users of large Web sites based on combining different data sources: access logs and content of the accessed pages are combined with sema...
Libraries in South Asia hold huge collections of valuable printed documents in Urdu and it is of interest to digitize these collections to make them more accessible. The unavail...