Information extraction from HTML pages has been conventionally treated as plain text documents extended with HTML tags. However, the growing maturity and correct usage of HTML/XHT...
Many applications which use web data extract information from a limited number of regions on a web page. As such, web page division into blocks and the subsequent block classifica...
ost abstract sense, we build web pages so that computers can read them. The software that people use to access web pages is what "reads" the document. How the page is ren...
WebTracer is a new usability evaluation environment that supports recording, replaying, and analysis of a gazing point and operation while a user is browsing a website. WebTracer ...
Web information extraction is a fundamental issue for web information management and integrations. A common approach is to use wrappers to extract data from web pages or documents...