Linking Wikipedia to the Web

We investigate the task of finding links from Wikipedia pages to external web pages. Such external links significantly extend the information in Wikipedia with information from the Web at large, while retaining the encyclopedic organization of Wikipedia. We use a language modeling approach to create full-text and anchor text runs, and experiment with different document priors. In addition, we explore whether the social bookmarking site Delicious can be exploited to further improve our performance. We have constructed a test collection of 53 topics, which are Wikipedia pages on different entities. Our findings are that the anchor text index is a very effective method for retrieving home pages. URL class and anchor text length priors, and their combination, lead to the best results. Using Delicious on its own does not lead to very good results, but it does contain valuable information. Combining the best anchor text run and the Delicious run leads to further improvements. Categories and Sub...
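As a rough illustration of how a language modeling run with a document prior can rank candidate pages, the sketch below scores each page by log P(d) plus the sum of log P(t|d) over the query terms. It is a minimal sketch, not the authors' implementation: the Dirichlet smoothing, the value of `mu`, the function names, the example URLs, and the prior values are all assumptions chosen for illustration, and the abstract does not say which smoothing method or prior estimates were used.

```python
import math
from collections import Counter


def score(query_terms, doc_terms, coll_tf, coll_len, prior=1.0, mu=100.0):
    """Return log P(d) + sum over query terms of log P(t|d).

    P(t|d) uses Dirichlet smoothing (an assumption made for this sketch).
    """
    tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    log_score = math.log(prior)
    for term in query_terms:
        p_coll = coll_tf.get(term, 0) / coll_len
        p_term = (tf[term] + mu * p_coll) / (doc_len + mu)
        if p_term == 0.0:
            # Term never occurs in the collection: the page cannot match.
            return float("-inf")
        log_score += math.log(p_term)
    return log_score


def rank(query_terms, docs, priors, mu=100.0):
    """Rank pages (url -> list of terms) by descending score."""
    coll_tf = Counter(t for terms in docs.values() for t in terms)
    coll_len = sum(coll_tf.values())
    scored = [
        (url, score(query_terms, terms, coll_tf, coll_len, priors[url], mu))
        for url, terms in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical anchor-text representations of two candidate pages.
docs = {
    "http://example.org/": ["example", "org", "home"],
    "http://example.org/a/b/page.html": ["example", "org", "home"],
}
# Hypothetical URL class priors: root pages are assumed more likely
# to be home pages than deeply nested paths.
priors = {
    "http://example.org/": 0.6,
    "http://example.org/a/b/page.html": 0.1,
}

ranked = rank(["example", "org"], docs, priors)
```

Because the two pages have identical anchor text in this toy example, the prior alone decides the order, which is the kind of effect a URL class prior is meant to have when retrieving home pages.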
Rianne Kaptein, Pavel Serdyukov, Jaap Kamps
Added 16 Aug 2010
Updated 16 Aug 2010
Type Conference
Year 2010
Authors Rianne Kaptein, Pavel Serdyukov, Jaap Kamps