This work focuses on clustering a site into groups of documents that are predictive of future user accesses. Two approaches have been developed and tested. The first approach use...
Poor literacy remains a decisive barrier to the economic empowerment of many people in the developing world. Of particular importance is literacy in a widely spoken "world la...
Matthew Kam, Divya Ramachandran, Varun Devanathan,...
Compound (or mixed) document images contain graphic or textual content along with pictures. They are a very common form of document, found in magazines, brochures, websites, etc....
Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms to automatically classify te...
It is necessary to provide a method to store Web information effectively so it can be utilised as a future knowledge resource. A commonly adopted approach is to classify the retri...