A method of determining the similarity of nouns on the basis of a metric derived from the distribution of subject, verb and object in a large text corpus is described. The resulti...
As random access memory gets cheaper, it becomes increasingly affordable to build computers with large main memories. We consider decision support workloads within the context of...
Sequential pattern mining has been an emerging problem in data mining. In this paper, we propose a new algorithm for mining frequent sequences. It processes only one scan of the da...
As many metadata are encoded in XML, and many digital libraries need to manage XML documents, efficient techniques for searching in such formatted data are required. In order to eï...
Giuseppe Amato, Franca Debole, Pavel Zezula, Faust...
We present the algorithmic core of a full text data base that allows fast Boolean queries, phrase queries, and document reporting using less space than the input text. The system ...