Topic distillation aims at finding key resources which are high-quality pages for certain topics. With analysis in non-content features of key resources, a pre-selection method is ...
This paper retains the experiences by participating in TREC 2007 Blog Track ‘Feed Distillation’. To perform the run various classifiers are combined, which analyze title-, co...
Wai-Lung Lee, Andreas Lommatzsch, Christian Scheel
We describe the architecture of a hypertext resource discovery system using a relational database. Such a system can answer questions that combine page contents, metadata, and hyp...
Soumen Chakrabarti, Martin van den Berg, Byron Dom
The Hypertext-based Webs such as Intranets contain a vast amount of information pertaining to an enormous number of subjects. It is, however, an organically grown and thus essentia...
Wikipedia is the largest monolithic repository of human knowledge. In addition to its sheer size, it represents a new encyclopedic paradigm by interconnecting articles through hyp...