We study methods to initialize or bias different clustering methods using prior information about the "importance" of a keyword with respect to the whole document collection or a...
Repetition of layout structure is prevalent in document images. In document design, such repetition conveys the underlying logical and functional structure of the data. For exampl...
The Web Ontology Language (OWL) defines three classes of documents: Lite, DL and Full. All RDF/XML documents are OWL Full documents; some OWL Full documents are also OWL DL docume...
This paper introduces the concept of accessibility from the field of transportation planning and adopts it within the context of Information Retrieval (IR). An analogy is drawn bet...
Despite the ubiquity of XML, research on metrics for XML documents is scarce. This paper proposes and discusses eleven metrics to measure the quality and complexity of XML Schema ...