Keyword indices, topic directories, and link-based rankings are used to search and structure the rapidly growing Web today. Surprisingly little use is made of years of browsing ex...
A large number of web pages contain data structured in the form of “lists”. Many such lists can be further split into multi-column tables, which can then be used in more seman...
This paper addresses the issue of Web document summarization. As textual content of Web documents is often scarce or irrelevant and existing summarization techniques are based on ...
Heterogeneous and dirty data is abundant. It is stored under different, often opaque schemata, it represents identical real-world objects multiple times, causing duplicates, and ...
Alexander Bilke, Jens Bleiholder, Christoph Bö...
Background: The behaviour of biological systems can be deduced from their mathematical models. However, multiple sources of data in diverse forms are required in the construction ...
Peter Li, Joseph O. Dada, Daniel Jameson, Irena Sp...