Sciweavers

» Automatic Wrapper Generation for Web Search Engines
PVLDB
2008
14 years 9 months ago
WebTables: exploring the power of tables on the web
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
PKDD
2004
Springer
15 years 3 months ago
Breaking Through the Syntax Barrier: Searching with Entities and Relations
The next wave in search technology will be driven by the identification, extraction, and exploitation of real-world entities represented in unstructured textual sources. Search sy...
Soumen Chakrabarti
SIGIR
2004
ACM
15 years 3 months ago
Parameterized generation of labeled datasets for text categorization based on a hierarchical directory
Although text categorization is a burgeoning area of IR research, readily available test collections in this field are surprisingly scarce. We describe a methodology and system (...
Dmitry Davidov, Evgeniy Gabrilovich, Shaul Markovi...
JCDL
2009
ACM
15 years 4 months ago
A framework for describing web repositories
In prior work we have demonstrated that search engine caches and archiving projects like the Internet Archive’s Wayback Machine can be used to “lazily preserve” websites and...
Frank McCown, Michael L. Nelson
MIR
2006
ACM
15 years 3 months ago
An adaptive graph model for automatic image annotation
Automatic keyword annotation is a promising solution to enable more effective image search by using keywords. In this paper, we propose a novel automatic image annotation method b...
Jing Liu, Mingjing Li, Wei-Ying Ma, Qingshan Liu, ...