9 years 8 months ago
Prophiler: a fast filter for the large-scale detection of malicious web pages
Malicious web pages that host drive-by-download exploits have become a popular means for compromising hosts on the Internet and, subsequently, for creating large-scale botnets. In...
Davide Canali, Marco Cova, Giovanni Vigna, Christo...
9 years 11 months ago
Language Based Crawling: Crawling the Arabic Content of the Web
Crawling web pages written in Arabic or any other language with limited content on the web may, at first, seem to parallel the process of crawling the English content. However, t...
Saad H. Alabbad, Sultan Alanazi
9 years 11 months ago
The Impact of Query Refinement in the Web People Search Task
Searching for a person name in a Web Search Engine usually leads to a number of web pages that refer to several people sharing the same name. In this paper we study whether it is ...
Javier Artiles, Julio Gonzalo, Enrique Amigó...
9 years 12 months ago
Enriching the Contents of Enterprises' Wiki Systems with Web Information
Wikis are currently used in providing knowledge management systems for individual enterprises. The initial explanations of word entries (entities) in such a system can be...
Li Zhao, Yexin Wang, Congrui Huang, Yan Zhang
10 years 3 days ago
DSMW: Distributed Semantic MediaWiki
DSMW is an extension to Semantic MediaWiki (SMW) that allows creating a network of SMW servers that share common semantic wiki pages. DSMW users can create communication channels b...
Hala Skaf-Molli, Gérôme Canals, Pasca...
JIS 2008
10 years 1 months ago
A three-year study on the freshness of web search engine databases
This paper deals with one aspect of the index quality of search engines: index freshness. The purpose is to analyse the update strategies of the major Web search engines Google, Y...
Dirk Lewandowski
ENTCS 2008
10 years 1 months ago
Formal Verification of Websites
In this paper, a model for websites is presented. The model is well-suited for the formal verification of dynamic as well as static properties of the system. A website is defined ...
Sonia Flores, Salvador Lucas, Alicia Villanueva
TSMC 2002
10 years 1 months ago
Dynamic page based crossover in linear genetic programming
Page-based Linear Genetic Programming (GP) is proposed in which individuals are described in terms of a number of pages. Pages are expressed in terms of a fixed number of instructi...
Malcolm I. Heywood, A. Nur Zincir-Heywood
TKDE 2002
10 years 1 months ago
Query Relaxation by Structure and Semantics for Retrieval of Logical Web Documents
Since the WWW encourages hypertext and hypermedia document authoring (e.g. HTML or XML), Web authors tend to create documents that are composed of multiple pages connected with hyperl...
Wen-Syan Li, K. Selçuk Candan, Quoc Vu, Div...
JIIS 2002
10 years 1 months ago
Hidden Markov Models for Text Categorization in Multi-Page Documents
In the traditional setting, text categorization is formulated as a concept learning problem where each instance is a single isolated document. However, this perspective is not appr...
Paolo Frasconi, Giovanni Soda, Alessandro Vullo