Sciweavers

» Information Extraction
WWW
2010
ACM
15 years 11 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
JCDL
2006
ACM
15 years 10 months ago
Automatic extraction of table metadata from digital documents
Tables are used to present, list, summarize, and structure important data in documents. In scholarly articles, they are often used to present the relationships among data and high...
Ying Liu, Prasenjit Mitra, C. Lee Giles, Kun Bai
ICDE
2009
IEEE
16 years 6 months ago
Weighted Proximity Best-Joins for Information Retrieval
We consider the problem of efficiently computing weighted proximity best-joins over multiple lists, with applications in information retrieval and extraction. We are given a multi-...
AnHai Doan, Haixun Wang, Hao He, Jun Yang 0001, Ri...
ICDAR
2009
IEEE
15 years 2 months ago
Disease-Specific Extraction of Text from Cardiac Echo Videos for Decision Support
Echo videos are an important modality for cardiac decision support. In addition to describing the shape and motion of the heart, they capture important diagnostic measurements as ...
Tanveer Fathima Syeda-Mahmood, David Beymer, Arnon...
WWW
2007
ACM
16 years 5 months ago
Adaptive record extraction from web pages
We describe an adaptive method for extracting records from web pages. Our algorithm combines a weighted tree matching metric with clustering for obtaining data extraction patterns...
Justin Park, Denilson Barbosa