We present a probabilistic model for generating personalised recommendations of items to users of a web service. The Matchbox system makes use of content information in the form o...
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...
This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
The term "Web 2.0" is used to describe applications that distinguish themselves from previous generations of software by a number of principles. Existing work shows that...
Finding relationships between entities on the Web, e.g., the connections between different places or the commonalities of people, is a novel and challenging problem. Existing Web ...