Sciweavers

ICPR
2010
IEEE
14 years 7 months ago
Learning Image Anchor Templates for Document Classification and Data Extraction
Image anchor templates are used in document image analysis for document classification, data localization, and other tasks. Current tools allow human operators to mark out small s...
Prateek Sarkar
ELPUB
2006
ACM
15 years 3 months ago
Automated Building of OAI Compliant Repository from Legacy Collection
In this paper, we report on our experience with the creation of an automated, human-assisted process to extract metadata from documents in a large (>100,000), dynamically growi...
Jianfeng Tang, Kurt Maly, Steven J. Zeil, Mohammad...
WWW
2004
ACM
15 years 10 months ago
Automatically collecting, monitoring, and mining Japanese weblogs
We present a system that tries to automatically collect and monitor Japanese blog collections that include not only ones made with blog software but also ones written as normal w...
Tomoyuki Nanno, Toshiaki Fujiki, Yasuhiro Suzuki, ...
LWA
2008
14 years 11 months ago
Rule-Based Information Extraction for Structured Data Acquisition using TextMarker
Information extraction is concerned with the location of specific items in (unstructured) textual documents, e.g., being applied for the acquisition of structured data. Then, the ...
Martin Atzmüller, Peter Klügl, Frank Pup...
ACMICEC
2006
ACM
15 years 3 months ago
From HTML documents to web tables and rules
We present a browser-extending Semantic Web extraction system that maps HTML documents to tables and, where possible, to rules. First, the basic data extractor ViPER distills and ...
Kai Simon, Georg Lausen, Harold Boley