In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
Botnets are large groups of compromised machines (bots) used by miscreants for a variety of illegal activities (e.g., sending spam emails, denial-of-service attacks, phishing and other...
Emanuele Passerini, Roberto Paleari, Lorenzo Marti...
When search results against digital libraries and web resources have limited metadata, augmenting them with meaningful and stable category information can enable better overviews ...
Embedded systems such as smart cards or sensors are now widespread, but they are often closed systems, accessed only via dedicated terminals. A new trend is to embed Web serv...
Simon Duquennoy, Gilles Grimaud, Jean-Jacques Vand...
Web prefetching techniques have proven especially important for reducing web latencies and, consequently, a substantial body of work can be found in the open literature. But, in...