Sciweavers

» Random web crawls
SAC
2005
ACM
15 years 3 months ago
A distributed content-based search engine based on mobile code
Current search engines crawl the Web, download content, and digest this content locally. For multimedia content, this involves considerable volumes of data. Furthermore, this proc...
Volker Roth, Ulrich Pinsdorf, Jan Peters
ERCIMDL
2005
Springer
15 years 3 months ago
A Comparison of On-Line Computer Science Citation Databases
This paper examines the difference and similarities between the two on-line computer science citation databases DBLP and CiteSeer. The database entries in DBLP are inserted manual...
Vaclav Petricek, Ingemar J. Cox, Hui Han, Isaac G....
FC
2010
Springer
15 years 1 month ago
Measuring the Perpetrators and Funders of Typosquatting
We describe a method for identifying “typosquatting”, the intentional registration of misspellings of popular website addresses. We estimate that at least 938 000 typosquatting...
Tyler Moore, Benjamin Edelman
FEGC
2006
14 years 11 months ago
Maintaining an Online Bibliographical Database: The Problem of Data Quality
CiteSeer and Google-Scholar are huge digital libraries which provide access to (computer-)science publications. Both collections are operated like specialized search engines, they ...
Michael Ley, Patrick Reuther
PPL
2008
14 years 9 months ago
An Importance-Aware Architecture for Large-Scale Grid Information Services
This paper is concerned with the scalability of large-scale grid monitoring and information services, which are mainly used for the discovery of resources of interest. Large-scale...
Serafeim Zanikolas, Rizos Sakellariou