Sciweavers

263 search results for "Re-engineering structures from Web documents" (page 1 / 53)
WWW
2011
ACM
12 years 11 months ago
Identifying primary content from web pages and its application to web search ranking
Web pages are usually highly structured documents. In some documents, content with different functionality is laid out in blocks, some merely supporting the main discourse. In ot...
Srinivas Vadrevu, Emre Velipasaoglu
DL
2000
Springer
Digital Library, 156 views
13 years 9 months ago
Re-engineering structures from Web documents
To realise a wide range of applications (including digital libraries) on the Web, a more structured way of accessing the Web is required and such requirement can be facilitated by...
Chuang-Hue Moh, Ee-Peng Lim, Wee Keong Ng
PVLDB
2010
135 views
13 years 3 months ago
SXPath - Extending XPath towards Spatial Querying on Web Documents
Querying data from presentation formats like HTML, for purposes such as information extraction, requires the consideration of tree structures as well as the consideration of spati...
Ermelinda Oro, Massimo Ruffolo, Steffen Staab
DOCENG
2010
ACM
13 years 3 months ago
From templates to schemas: bridging the gap between free editing and safe data processing
In this paper we present tools that provide an easy way to edit XML content directly on the web, with the usual benefit of valid XML content. These tools make it possible to crea...
Vincent Quint, Cécile Roisin, Stépha...
FLAIRS
2001
13 years 6 months ago
Syntactic Folding and its Application to the Information Extraction from Web Pages
The paper deals with investigations concerning potential structures of documents that will be subject to automated information extraction. The focus is on folding principles and the...
Jörg Herrmann