We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
A link farm is a set of web pages constructed to mislead the importance of target pages in search engine results by boosting their link-based ranking scores. In this paper, we int...
We present in this paper ObjectRunner, a system for extracting, integrating and querying structured data from the Web. Our system harvests real-world items from template-based HTM...
For many users with a disability it can be difficult or impossible to use a computer mouse to navigate the web. An alternative way to select elements on a web page is the label ty...
This paper presents a semantic wiki prototype application named SHAWN that allows structuring concepts within a wiki environment. To entice the use of Semantic Web technologies app...