Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
Throughout its history, AI researchers have alternately seen their mission as producing computer behavior that is indistinguishable from that of humans or as providing computati...
The advent of XML as a universal exchange format, and of Web services as a basis for distributed computing, has fostered the emergence of a new class of documents: dynamic XML do...
Interoperability is one of the main issues in creating a networked system of repositories. The approaches range from simply forcing one metadata standard on all participating repos...
Marek Hatala, Griff Richards, Timmy Eap, Jordan Wi...
Breaking news often contains timely definitions and descriptions of current terms, organizations and personalities. We utilize such web sources to construct definitions for such t...