P-Jigsaw is an extension of W3C's Jigsaw Web server that implements a cache management strategy for replacement and pre-fetching, based on association rule mining from the acces...
We introduce an active data mining paradigm that combines the recent work in data mining with the rich literature on active database systems. In this paradigm, data is continuousl...
Information extraction is one of the most important techniques used in Text Mining. One of the main problems in building information extraction (IE) systems is that the knowledge ...
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Advances in imaging techniques have led to large repositories of images. There is an increasing demand for automated systems that can analyze complex medical images and extract me...