Sciweavers
Search results for "Deep web data extraction"
WWW 2010 (ACM)
Not so creepy crawler: easy crawler generation with standard XML queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
WWW 2005 (ACM)
Fully automatic wrapper generation for search engines
When a query is submitted to a search engine, the search engine returns a dynamically generated result page containing the result records, each of which usually consists of a link...
Hongkun Zhao, Weiyi Meng, Zonghuan Wu, Vijay Ragha...
KDD 2008 (ACM)
Information extraction from Wikipedia: moving down the long tail
Not only is Wikipedia a comprehensive source of quality information, it has several kinds of internal structure (e.g., relational summaries known as infoboxes), which enable self-...
Fei Wu, Raphael Hoffmann, Daniel S. Weld
AIRWEB 2007 (Springer)
Extracting Link Spam using Biased Random Walks from Spam Seed Sets
Link spam deliberately manipulates hyperlinks between web pages in order to unduly boost the search engine ranking of one or more target pages. Link-based ranking algorithms such ...
Baoning Wu, Kumar Chellapilla
FGCS 2007
From bioinformatic web portals to semantically integrated Data Grid networks
We propose a semi-automated method for redeploying bioinformatic databases indexed in a Web portal as a decentralized, semantically integrated and service-oriented Data Grid. We g...
Adriana Budura, Philippe Cudré-Mauroux, Kar...