In this paper, we study the problem of measuring structural similarities of large number of source schemas against a single domain schema, which is useful for enhancing the qualit...
This paper addresses the problem of extending an adaptive information filtering system to make decisions about the novelty and redundancy of relevant documents. It argues that rel...
Automatically generated content is ubiquitous in the web: dynamic sites built using the three-tier paradigm are good examples (e.g. commercial sites, blogs and other sites powered...
For the RISM A/II collection of musical incipits (short extracts of scores, taken from the beginning), we have established a ground truth based on the opinions of human experts. I...
Abstract. Existing methods to text plagiarism analysis mainly base on “chunking”, a process of grouping a text into meaningful units each of which gets encoded by an integer nu...