Conventionally, Web pages have been recognized as documents described by HTML. Image data, such as photographs, logos, maps, illustrations, and decorated text, have been treated a...
This paper presents a system that uses the domain name of a German business website to locate its information pages (e.g. company profile, contact page, imprint) and then identifi...
We propose a novel approach to reverse engineering relational databases into ontologies. Our approach is based on the idea that the semantics of a relational database can be...
A substantial subset of the web data follows some kind of underlying structure. In order to let software programs gain full benefit from these “semistructured” web sources, wra...
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...