— This paper presents the prototype of an expert peering system for information exchange in the knowledge society. Our system realizes an intelligent, real-time search engine for...
Christian Bauckhage, Tansu Alpcan, Sachin Agarwal,...
This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
This paper presents experiments on classifying web pages by genre. Firstly, a corpus of 1539 manually labeled web pages was prepared. Secondly, 502 genre features were selected ba...
A Focused crawler must use information gleaned from previously crawled page sequences to estimate the relevance of a newly seen URL. Therefore, good performance depends on powerfu...
Hongyu Liu, Evangelos E. Milios, Jeannette Janssen
Similarity search methods are widely used as kernels in various data mining and machine learning applications including those in computational biology, web search/clustering. Near...