A large number of web pages contain data structured in the form of “lists”. Many such lists can be further split into multi-column tables, which can then be used in more seman...
Recent advances in information retrieval over hyperlinked corpora have convincinglydemonstratedthat links carry less noisy information than text. We investigate the feasibility of...
Single-Sign-On (SSO) protocols enable companies to establish a federated environment in which clients sign in the system once and yet are able to access to services offered by dif...
The great success of Web 2.0 is mainly fuelled by an infrastructure that allows web users to create, share, tag, and connect content and knowledge easily. The tools for developing...
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...