User interfaces to WWW search engines typically present results as ranked lists of documents. Such lists give users little help in understanding document variation: we propose a ...
Ivan Bretan, Johan Dewe, Anders Hallberg, Niklas W...
While the Web makes an increasing number of ontologies widely available for applications, how to discover ontologies becomes a more challenging issue. Existing approaches are mainl...
In previous research it has been shown that link-based web page metrics can be used to predict experts’ assessment of quality. We are interested in a related question: do expert...
Even prior to content, the genre of a web document leads to a first coarse binary classification of the recall space into relevant and non-relevant documents. Thinking of a genre se...
Andrea Stubbe, Christoph Ringlstetter, Randy Goebe...
Broder et al.’s [3] shingling algorithm and Charikar’s [4] random projection based approach are considered “state-of-the-art” algorithms for finding near-duplicate web pag...