Sciweavers

» Extracting Structured Data from Web Pages
CIDR
2011
14 years 7 months ago
Longitudinal Analytics on Web Archive Data: It's About Time!
Organizations like the Internet Archive have been capturing Web contents over decades, building up huge repositories of time-versioned pages. The timestamp annotations and the she...
Gerhard Weikum, Nikos Ntarmos, Marc Spaniol, Peter...
SEMWEB
2009
Springer
15 years 10 months ago
Graph-Based Ontology Construction from Heterogenous Evidences
Abstract. Ontologies are tools for describing and structuring knowledge, with many applications in searching and analyzing complex knowledge bases. Since building them manually is ...
Christoph Böhm, Philip Groth, Ulf Leser
DEBU
2000
15 years 3 months ago
Personal Views for Web Catalogs
Large growth in e-commerce has culminated in a technology boom to enable companies to better serve their consumers. The front-end of the e-commerce business is to better reach the ...
Kajal T. Claypool, Li Chen, Elke A. Rundensteiner
WWW
2008
ACM
16 years 4 months ago
Recrawl scheduling based on information longevity
It is crucial for a web crawler to distinguish between ephemeral and persistent content. Ephemeral content (e.g., quote of the day) is usually not worth crawling, because by the t...
Christopher Olston, Sandeep Pandey
ICCV
2005
IEEE
15 years 9 months ago
Learning Non-Generative Grammatical Models for Document Analysis
We present a general approach for the hierarchical segmentation and labeling of document layout structures. This approach models document layout as a grammar and performs a glo...
Michael Shilman, Percy Liang, Paul A. Viola