Optimizing Query Processing for the Hidden Web

Abstract. The term Deep Web (sometimes also called Hidden Web) refers to the data content that is created dynamically as the result of a specific search on the Web. In this respect, such content resides outside web pages, and is only accessible through interaction with the web site – typically via HTML forms. It is believed that the size of the Deep Web is several orders of magnitude larger than that of the so-called Surface Web, i.e., the web that is accessible and indexable by search engines. Usually, data sources accessible through web forms are modeled by relations that require certain fields to be selected – i.e., some fields in the form need to be filled in. These requirements are commonly referred to as access limitations in that access to data can only take place according to given patterns. Besides data accessible through web forms, access limitations may also occur i) in legacy systems where data scattered over several files are wrapped as relational tables, and ii) ...
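To make the notion of an access limitation concrete, the following is a minimal sketch and not the authors' method. It models a form-backed source as a relation whose fields are each marked input ("i", must be filled in) or output ("o"). The class name LimitedRelation, the "i"/"o" pattern string, and the concerts data are all assumptions made up for illustration.

```python
from typing import List, Sequence, Tuple

Row = Tuple[str, ...]


class LimitedRelation:
    """Hypothetical model of a source reachable only through a form."""

    def __init__(self, fields: Sequence[str], pattern: str, rows: List[Row]) -> None:
        # The pattern has one character per field: 'i' (input) or 'o' (output).
        if len(fields) != len(pattern) or set(pattern) - {"i", "o"}:
            raise ValueError("pattern needs exactly one 'i' or 'o' per field")
        self.fields = list(fields)
        self.pattern = pattern
        self.rows = rows

    def access(self, **bindings: str) -> List[Row]:
        """Return matching rows, but only if every input field is bound."""
        unknown = [f for f in bindings if f not in self.fields]
        if unknown:
            raise KeyError(f"unknown fields: {unknown}")
        missing = [
            f for f, mode in zip(self.fields, self.pattern)
            if mode == "i" and f not in bindings
        ]
        if missing:
            # The form cannot be submitted without these values.
            raise ValueError(f"access limitation violated, missing: {missing}")
        return [
            row for row in self.rows
            if all(row[self.fields.index(f)] == v for f, v in bindings.items())
        ]


# Made-up example: the form requires an artist; city and year are optional filters.
concerts = LimitedRelation(
    fields=["artist", "city", "year"],
    pattern="ioo",
    rows=[
        ("Adele", "London", "2011"),
        ("Adele", "Paris", "2011"),
        ("Muse", "London", "2010"),
    ],
)

adele_shows = concerts.access(artist="Adele")
```

Under these assumptions, concerts.access(city="London") raises an error because the required artist field is left empty, even though matching rows exist. This is the sense in which data behind a form can only be reached according to given patterns.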
Andrea Calì, Davide Martinenghi
Added 18 Jul 2010
Updated 18 Jul 2010
Type Conference
Year 2010