As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. Few users wish to retri...
Many large-scale Web applications that require ranked top-k retrieval are implemented using inverted indices. An inverted index represents a sparse term-document matrix, where non...
George Beskales, Marcus Fontoura, Maxim Gurevich, ...
This paper discusses extensions to the previously developed “essentiality and proficiency” approach to increasing usability and accessibility of websites. The existing approa...
Matthew T. Atkinson, Jatinder Dhiensa, Colin H. C....
Retrieval in historic documents with non-standard spelling requires a mapping from search terms onto the historic terms in the document. For describing this mapping, we have develo...
In business-to-business e-commerce, traditional electronic data interchange (EDI) approaches such as UN/EDIFACT have been superseded by approaches like web services and ebXML. Nev...