Duplicate detection is the process of identifying multiple representations of a same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
Automatic extraction of semantic information from text and links in Web pages is key to improving the quality of search results. However, the assessment of automatic semantic meas...
Ana Gabriela Maguitman, Filippo Menczer, Heather R...
In this paper, we introduce an implementation for detailed monitoring of user actions on web pages. It addresses the problem that the log data recorded by standard web servers is ...
Document-centric XML collections contain text-rich documents, marked up with XML tags that add lightweight semantics to the text. Querying such collections calls for a hybrid quer...
With the proliferation of online distribution methods for videos, content owners require easier and more effective methods for monetization through advertising. Matching advertis...