Sciweavers

784 search results - page 96 / 157
» Information Extraction from Multimodal ECG Documents
SIGMOD
2009
ACM
15 years 4 months ago
Robust web extraction: an approach based on a probabilistic tree-edit model
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Nilesh N. Dalvi, Philip Bohannon, Fei Sha
SIGIR
2006
ACM
15 years 3 months ago
Feature diversity in cluster ensembles for robust document clustering
The performance of document clustering systems depends on employing optimal text representations, which are not only difficult to determine beforehand, but also may vary from one ...
Xavier Sevillano, Germán Cobo, Francesc Al&...
ADC
2006
Springer
15 years 3 months ago
A two-phase rule generation and optimization approach for wrapper generation
Web information extraction is a fundamental issue for web information management and integration. A common approach is to use wrappers to extract data from web pages or documents...
Yanan Hao, Yanchun Zhang
ICDAR
2003
IEEE
15 years 3 months ago
A Model-based Line Detection Algorithm in Documents
In this paper we present a novel model-based approach to detect severely broken parallel lines in noisy textual documents. It is important to detect and remove these lines so the ...
Yefeng Zheng, Huiping Li, David S. Doermann
COST
1994
Springer
15 years 1 month ago
A Mail-Based Teleservice Architecture for Archiving and Retrieving Dynamically Composable Multimedia Documents
In this paper, a teleservice for archiving and retrieving multimedia documents using public networks is described. This teleservice encourages a broad range of commercially applic...
Heiko Thimm, Katja Röhr, Thomas C. Rakow