Sciweavers

85 search results - page 12 / 17
» Extracting unstructured data from template generated web doc...
PVLDB
2008
14 years 9 months ago
WebTables: exploring the power of tables on the web
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
PAKDD
2001
ACM
15 years 2 months ago
Applying Pattern Mining to Web Information Extraction
Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction aim to so...
Chia-Hui Chang, Shao-Chen Lui, Yen-Chin Wu
CLOUDCOM
2010
Springer
14 years 7 months ago
Efficient Metadata Generation to Enable Interactive Data Discovery over Large-Scale Scientific Data Collections
Discovering the correct dataset efficiently is critical for computations and effective simulations in scientific experiments. In contrast to searching web documents over the Intern...
Sangmi Lee Pallickara, Shrideep Pallickara, Milija...
IRI
2007
IEEE
15 years 3 months ago
Acronym-Expansion Recognition and Ranking on the Web
The paper presents a study on large-scale automatic extraction of acronyms and associated expansions from Web data and from the user interactions with this data through Web search...
Alpa Jain, Silviu Cucerzan, Saliha Azzam
FLAIRS
2007
14 years 12 months ago
The Evolution and Evaluation of an Internet Search Tool for Information Analysts
We are working on a project aimed at building next generation analyst support tools that focus analysts’ attention on the most critical and novel information found within the da...
Elizabeth T. Whitaker, Robert L. Simpson Jr.