Sciweavers

Search results for: Compacting XML Documents
ACMICEC
2006
ACM
15 years 5 months ago
From HTML documents to web tables and rules
We present a browser-extending Semantic Web extraction system that maps HTML documents to tables and, where possible, to rules. First, the basic data extractor ViPER distills and ...
Kai Simon, Georg Lausen, Harold Boley
JCDL
2010
ACM
15 years 4 months ago
ProcessTron: efficient semi-automated markup generation for scientific documents
Digitizing legacy documents and marking them up with XML is important for many scientific domains. However, creating comprehensive semantic markup of high quality is challenging. ...
Guido Sautter, Klemens Böhm, Conny Kühne...
SIGIR
2002
ACM
14 years 11 months ago
Unsupervised document classification using sequential information maximization
We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential ...
Noam Slonim, Nir Friedman, Naftali Tishby
SADFE
2009
IEEE
15 years 6 months ago
Automating Disk Forensic Processing with SleuthKit, XML and Python
We have developed a program called fiwalk which produces detailed XML describing all of the partitions and files on a hard drive or disk image, as well as any extractable metadat...
Simson L. Garfinkel
ERCIMDL
2005
Springer
15 years 4 months ago
A Native XML Database Supporting Approximate Match Search
XML is becoming the standard representation format for metadata. Metadata for multimedia documents, as for instance MPEG-7, require approximate match search functionalities to be s...
Giuseppe Amato, Franca Debole