We present two machine learning approaches to information extraction from semi-structured documents that can be used if no annotated training data are available, but there does ex...
While the Internet has facilitated access to information sources, the task of scalable integration of these heterogeneous data sources remains a challenge. The adoption of the eXte...
This paper presents an architecture of ontological components for the Semantic Web. Many methods and methodologies can be found in the literature. Generally, they are dedicated to ...
Nesrine Ben Mustapha, Marie-Aude Aufaure, Hajer Ba...
This paper revisits the classical problem of multiquery optimization in the context of RDF/SPARQL. We show that the techniques developed for relational and semi-structur...
Wangchao Le, Anastasios Kementsietsidis, Songyun D...
Document clustering techniques mostly rely on single term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...