This paper offers a novel look at using a dimensionality reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Search engines are among the most important applications or services on the web. Most existing successful search engines use global ranking algorithms to generate the ranking of do...
The Semantic Web is a new layer of the Internet that enables semantic representation of the contents of existing web pages. Using common ontologies, human users sketch out the mos...
Christian Fillies, Gay Wood-Albrecht, Frauke Weich...
Many of today's Web applications support just simple trial-and-error retrievals: supply one set of parameters, obtain one set of results. For a user who wants to examine a num...
Modern object-relational database systems are capable of managing multimedia data, e.g. image, video and audio. In this paper we study how such universal database systems can be u...