It is observed that a better approach to Web information understanding is to base on its document framework, which is mainly consisted of (i) the title and the URL name of the pag...
— The rendering mechanism used in Web browsers have a significant impact on the user behavior and delay tolerance of retrieval. The head-of-line blocking phenomena prevents the ...
Cloning is extremely likely to occur in web sites, much more so than in other software. While some clones exist for valid reasons, or are too small to eliminate, cloning percentag...
We present our vision and preliminary design toward web-crawled academic video search engine, named as LeeDeo, that can search, crawl, archive, index, and browse “academic” vi...
Dongwon Lee, Hung-sik Kim, Eun Kyung Kim, Su Yan, ...
This paper presents a system that uses the domain name of a German business website to locate its information pages (e.g. company profile, contact page, imprint) and then identifi...