We present two machine learning approaches to information extraction from semi-structured documents that can be used if no annotated training data are available, but there does ex...
Creating video recordings of events such as lectures or meetings is increasingly inexpensive and easy. However, reviewing the content of such video may be time-consuming and dif...
In this paper we propose a methodology to learn to extract domain-specific information from large repositories (e.g. the Web) with minimum user intervention. Learning is seeded b...
Fabio Ciravegna, Alexiei Dingli, David Guthrie, Yo...
Traditionally, machine learning approaches for information extraction require human annotated data that can be costly and time-consuming to produce. However, in many cases, there ...
Abstract. This paper describes the setup of the Book Structure Extraction competition run at ICDAR 2009. The goal of the competition was to evaluate and compare automatic technique...
Antoine Doucet, Gabriella Kazai, Bodin Dresevic, A...