We investigate the novel problem of event recognition from news webpages. "Events" are basic text units containing news elements. We observe that a news article is always...
Automatic identification of a script in a given document image facilitates many important applications such as automatic archiving of multilingual documents, searching on...
Gopal Datt Joshi, Saurabh Garg, Jayanthi Sivaswamy
Comments left by readers on Web documents contain valuable information that can be utilized in different information retrieval tasks including document search, visualization, and ...
This paper introduces our efforts to create UPX, an XML-based successor to the venerable UNIPEN format for the representation of annotated datasets of online handwriting data. In ...
The Web continues to grow at a tremendous rate. Search engines find it increasingly difficult to provide useful results. To manage this explosively large number of Web documents,...
Sandip Debnath, Tracy Mullen, Arun Upneja, C. Lee ...