We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
The goal of information extraction is to extract database records from text or semi-structured sources. Traditionally, information extraction proceeds by first segmenting each ca...
This paper details the participation of the XLDB group from the University of Lisbon in the GeoCLEF task of CLEF 2006. We tested text mining methods that make use of an ontology t...
Bruno Martins, Nuno Cardoso, Marcirio Silveira Cha...
When multiple ontologies are used within one application system, aligning the ontologies is a prerequisite for interoperability and unhampered semantic navigation and search. Vario...
In this paper, we present a scheme for identifying instances of events and extracting information about them. The scheme can handle all events with which an action can be associat...
Harsha V. Madhyastha, N. Balakrishnan, K. R. Ramak...