ESWS
2010
Springer

An Unsupervised Approach for Acquiring Ontologies and RDF Data from Online Life Science Databases

One of the largest data sets in the Linked Open Data cloud, comprising 2.5 billion triples, is derived from the Life Science domain. Yet this represents a small fraction of the total number of publicly available data sources on the Web. We briefly describe past attempts to transform specific Life Science sources from a plethora of open as well as proprietary formats into RDF data. In particular, we identify and tackle two bottlenecks in current practice: acquiring ontologies to formally describe these data, and creating “RDFizer” programs to convert data from legacy formats into RDF. We propose an unsupervised method, based on transformation rules, for performing these two key tasks; it builds on our previous work on unsupervised wrapper induction for extracting labelled data from complete Life Science Web sites. We apply our approach to 13 real-world online Life Science databases. The learned ontologies are evaluated by domain experts as well as against gold standard ont...
Saqib Mir, Steffen Staab, Isabel Rojas
Added 12 Oct 2010
Updated 12 Oct 2010
Type Conference
Year 2010
Where ESWS