The goal of information extraction is to extract database records from text or semi-structured sources. Traditionally, information extraction proceeds by first segmenting each ca...
In our previous publication (Wang 2002), we introduced a novel method that uses a hierarchical domain ontology structure to extract features representing documents. All raw words i...
Bill B. Wang, Robert I. McKay, Hussein A. Abbass, ...
We have constructed a set of ontologies modelled on conceptual structures elicited from several domain experts. Protocols were collected from various experts who advise on the se...
Digital libraries are invaluable repositories of information. However, in many situations, their size makes it difficult to access the desired resource. In this paper, we present...
In this paper, we report on the development of, and experiments with, IBM Content Harvester (CH), a tool to analyze and recover templates and content from word-processor-created text docume...