Site maps are frequently provided on Web sites as a navigation support for Web users. The automatic generation of site maps is a complex task since the structure of the data, sema...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
In order to artificially boost the rank of commercial pages in search engine results, search engine optimizers pay for links to these pages on other websites. Identifying paid lin...
Understanding intents from search queries can improve a user’s search experience and boost a site’s advertising profits. Query tagging via statistical sequential labeling mode...
Ye-Yi Wang, Raphael Hoffmann, Xiao Li, Jakub Szyma...
The paper presents an approach to combine knowledge from memory and brain sciences with information retrieval research in the design of Web agents. An information retrieval agent f...