Sciweavers

602 search results - page 42 / 121
» Integrating Data and Probabilistically Structured Text Docum...
ICDE
2007
IEEE
16 years 1 month ago
SQL Queries Over Unstructured Text Databases
Text documents often embed data that is structured in nature. By processing a text database with information extraction systems, we can define a variety of structured "relati...
Alpa Jain, AnHai Doan, Luis Gravano
ISEMANTICS
2010
15 years 1 months ago
sTeX+: a system for flexible formalization of linked data
We present the sTeX system, a semantic extension of LaTeX, that allows for producing high-quality PDF documents for (proof)reading and printing, as well as semantic XML/OMDoc docu...
Andrea Kohlhase, Michael Kohlhase, Christoph Lange...
DOCENG
2009
ACM
15 years 6 months ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan
IJMMS
2008
14 years 12 months ago
Ontology-based information extraction and integration from heterogeneous data sources
In this paper we present the design, implementation and evaluation of SOBA, a system for ontology-based information extraction from heterogeneous data resources, including plain t...
Paul Buitelaar, Philipp Cimiano, Anette Frank, Mat...
ICDE
2010
IEEE
15 years 11 months ago
Fast In-Memory XPath Search using Compressed Indexes
A large fraction of an XML document typically consists of text data. The XPath query language allows text search via the equal, contains, and starts-with predicates. Such predicate...
Diego Arroyuelo, Francisco Claude, Sebastian Manet...