Sciweavers

233 search results - page 34 / 47
» Keyword search across databases and documents
PVLDB
2008
14 years 11 months ago
WebTables: exploring the power of tables on the web
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
BMCBI
2006
14 years 11 months ago
G-InforBIO: integrated system for microbial genomics
Background: Genome databases contain diverse kinds of information, including gene annotations and nucleotide and amino acid sequences. It is not easy to integrate such information...
Naoto Tanaka, Takashi Abe, Satoru Miyazaki, Hideak...
RSFDGRC
2011
Springer
Data Mining
14 years 2 months ago
Construction and Analysis of Web-Based Computer Science Information Networks
WINACS (Web-based Information Network Analysis for Computer Science) is a project that incorporates many recent, exciting developments in data sciences to construct a Web-based co...
Jiawei Han
WWW
2010
ACM
15 years 6 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
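
The tag-ratio idea in the CETR entry above can be illustrated with a minimal sketch. The abstract is truncated before it defines the ratio, so this assumes a line's tag ratio is the number of non-tag characters divided by the number of HTML tags on that line, with the tag count treated as 1 when the line has none. The function name `tag_ratios` and the regex-based tag matching are illustrative assumptions, not the authors' implementation.

```python
import re

# Assumption: a "tag" is anything between '<' and '>' on a single line.
TAG_RE = re.compile(r"<[^>]*>")


def tag_ratios(html: str) -> list[float]:
    """Return one text-to-tag ratio per line of the HTML source."""
    ratios = []
    for line in html.splitlines():
        tag_count = len(TAG_RE.findall(line))
        # Characters left after removing tags and surrounding whitespace.
        text_length = len(TAG_RE.sub("", line).strip())
        # Avoid division by zero: a line with no tags keeps its text length.
        ratios.append(text_length / max(tag_count, 1))
    return ratios


if __name__ == "__main__":
    sample = (
        "<div><a href='x'>Home</a></div>\n"
        "<p>This is a long paragraph of article text.</p>\n"
        "plain text line"
    )
    for ratio, line in zip(tag_ratios(sample), sample.splitlines()):
        print(f"{ratio:6.2f}  {line}")
```

Under this assumption, navigation-style lines that are dense with tags score low and lines of running text score high, which is the signal a content extractor could threshold or cluster on.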
WSDM
2009
ACM
Data Mining
15 years 6 months ago
Time Will Tell: Leveraging Temporal Expressions in IR
Temporal expressions, such as between 1992 and 2000, are frequent across many kinds of documents. Text retrieval, though, treats them as common terms, thus ignoring their inherent...
Irem Arikan, Srikanta J. Bedathur, Klaus Berberich