Search engines--"web dragons"--are the portals through which we access society's treasure trove of information. They do not publish the algorithms they use to sort...
Recent work has shown the feasibility and promise of template-independent Web data extraction. However, existing approaches use decoupled strategies, attempting to do data record ...
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Y...
Providing web service replicas improves overall system performance and adds redundancy against hardware failures. In Business-to-Business settings, this may be particularly interesting for orga...
This paper presents WordRank, a new page ranking system, which exploits similarity between interconnected pages. WordRank introduces the model of the ‘biased surfer’ which is ...
Published experiments on spidering the Web suggest that, given training data in the form of a (relatively small) subgraph of the Web containing a subset of a selected class of tar...