While collections—aggregation mechanisms such as folders, buddy lists, photo albums, etc.—clearly play a central role in information management, the potential benefits of true...
Besides the problem of searching for effective methods for extracting knowledge from large databases (KDD), there are some additional problems with handling ecological data, namely ...
Current Web search engines generally impose link analysis-based re-ranking on web-page retrieval. However, the same techniques, when applied directly to small web search such as i...
Search engines crawl and index webpages depending upon their informative content. However, webpages — especially dynamically generated ones — contain items that cannot be clas...
Duplicate detection is the process of identifying multiple representations of the same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...