Sciweavers

CIKM
2009
Springer

Completing wikipedia's hyperlink structure through dimensionality reduction

13 years 11 months ago
Completing wikipedia's hyperlink structure through dimensionality reduction
Wikipedia is the largest monolithic repository of human knowledge. In addition to its sheer size, it represents a new encyclopedic paradigm by interconnecting articles through hyperlinks. However, since these links are created by human authors, links one would expect to see are often missing. The goal of this work is to detect such gaps automatically. In this paper, we propose a novel method for augmenting the structure of hyperlinked document collections such as Wikipedia. It does not require the extraction of any manually defined features from the article to be augmented. Instead, it is based on principal component analysis, a well-founded mathematical generalization technique, and predicts new links purely based on the statistical structure of the graph formed by the existing links. Our method does not rely on the textual content of articles; we are exploiting only hyperlinks. A user evaluation of our technique shows that it improves the quality of top link suggestions over the st...
Robert West, Doina Precup, Joelle Pineau
Added 26 May 2010
Updated 26 May 2010
Type Conference
Year 2009
Where CIKM
Authors Robert West, Doina Precup, Joelle Pineau
Comments (0)