Web extraction systems attempt to use the immense amount of unlabeled text in the Web in order to create large lists of entities and relations. Unlike traditional IE methods, the ...
The approach presented in this paper is intended for the semi-automatic construction of a learning object repository from HTML pages. An extraction method consists of applying the...
Relation extraction is a difficult open research problem with important applications in several fields such as knowledge management, web mining, ontology building, intelligent sys...
Linguists and geographers are more and more interested in route direction documents because they contain interesting motion descriptions and language patterns. A large number of s...
Xiao Zhang, Prasenjit Mitra, Sen Xu, Anuj R. Jaisw...
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...