Effective daily processing of large amounts of paper documents in office environments requires the application of semantic-based indexing techniques during the transformation of pa...
In this paper, we present an extension of PHIL, a declarative language for filtering information from XML data. The proposed approach allows us to extract relevant data as well a...
Previous studies of incomplete XML documents have identified three main sources of incompleteness – in structural information, data values, and labeling – and addressed data ...
We present in this paper a transformation model for structured documents. TransM is a new model that deals with specified documents, where the structure conforms to a predefined ...
Nouhad Amaneddine, Jean Paul Bahsoun, Jean-Paul Bo...
Document clustering techniques mostly rely on single-term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...