Sciweavers

2677 search results - page 10 / 536
» Extracting Structured Data from Web Pages
AAAI
2000
15 years 29 days ago
Learning the Common Structure of Data
The proliferation of online information sources has accentuated the need for tools that automatically validate and recognize data. We present an efficient algorithm that learns st...
Kristina Lerman, Steven Minton
CN
2007
14 years 11 months ago
On the peninsula phenomenon in web graph and its implications on web search
Web masters usually place certain web pages such as home pages and index pages in front of others. Under such a design, it is necessary to go through some pages to reach the desti...
Tao Meng, Hong-Fei Yan
KES
2006
Springer
14 years 11 months ago
Web Site Off-Line Structure Reconfiguration: A Web User Browsing Analysis
The right web site text content must help visitors find what they are looking for. However, the reality is quite different: many times the web page text content is a...
Sebastián A. Ríos, Juan D. Velá...
VLDB
2004
ACM
15 years 5 months ago
An Automatic Data Grabber for Large Web Sites
We demonstrate a system that automatically grabs data from data-intensive web sites. The system first infers a model that describes at the intensional level the web site as a collec...
Valter Crescenzi, Giansalvatore Mecca, Paolo Meria...
ICDE
2004
IEEE
16 years 29 days ago
Probe, Cluster, and Discover: Focused Extraction of QA-Pagelets from the Deep Web
In this paper, we introduce the concept of a QA-Pagelet to refer to the content region in a dynamic page that contains query matches. We present THOR, a scalable and efficient min...
James Caverlee, Ling Liu, David Buttler