In this paper, we present an extension of PHIL, a declarative language for filtering information from XML data. The proposed approach allows us to extract relevant data as well a...
Organizing textual documents into a hierarchical taxonomy is a common practice in knowledge management. Beside textual features, the hierarchical structure of directories reflect...
Yi Huang, Kai Yu, Matthias Schubert, Shipeng Yu, V...
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
The demand for ontologies is rapidly growing especially due to developments in knowledge management, E-commerce and the Semantic Web. Building an ontology and a background knowled...
We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is a ...