Sciweavers

38 search results - page 8 / 8
WWW
2008
ACM
14 years 5 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
ICDM
2008
IEEE
13 years 11 months ago
xCrawl: A High-Recall Crawling Method for Web Mining
Web Mining Systems exploit the redundancy of data published on the Web to automatically extract information from existing web documents. The first step in the Information Extract...
Kostyantyn M. Shchekotykhin, Dietmar Jannach, Gerh...
SIGIR
2005
ACM
13 years 10 months ago
An industrial-strength content-based music recommendation system
We present a metadata-free system for interacting with massive collections of music, the MusicSurfer. MusicSurfer automatically extracts descriptions related to instrumentatio...
Pedro Cano, Markus Koppenberger, Nicolas Wack