Sciweavers

611 search results - page 49 / 123
» Random web crawls
ICASSP
2010
IEEE
14 years 8 months ago
Exploring statistical properties for semantic annotation: sparse distributed and convergent assumptions for keywords
Does there exist a compact set of visual topics, in the form of keyword clusters, capable of representing the visual content of all images within an acceptable error? In this paper, we answer thi...
Xianming Liu, Hongxun Yao, Rongrong Ji
ISW
2010
Springer
14 years 7 months ago
An Architecture for Enforcing JavaScript Randomization in Web2.0 Applications
Instruction Set Randomization (ISR) is a promising technique for preventing code-injection attacks. In this paper we present a complete randomization framework for JavaSc...
Elias Athanasopoulos, Antonis Krithinakis, Evangel...
WSDM
2010
ACM
15 years 4 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
WWW
2009
ACM
15 years 10 months ago
Triplify: light-weight linked data publication from relational databases
In this paper we present Triplify, a simplistic but effective approach to publish Linked Data from relational databases. Triplify is based on mapping HTTP-URI requests onto relat...
Sören Auer, Sebastian Dietzold, Jens Lehmann,...
ERCIMDL
2005
Springer
15 years 3 months ago
mod_oai: An Apache Module for Metadata Harvesting
We describe mod_oai, an Apache 2.0 module that implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The OAI-PMH is the de facto standard for metadata...
Michael L. Nelson, Herbert Van de Sompel, Xiaoming...