We continue the study of approximating the number of distinct elements in a data stream of length n to within a (1 ± ε) factor. It is known that if the stream may consist of arbitra...
Major research challenges in discovering Web services include provisioning of services across multiple or heterogeneous registries, differentiating between services that share si...
XML is fast becoming the standard format to store, exchange and publish over the web, and is getting embedded in applications. Two challenges in handling XML are its size (the XML...
Paolo Ferragina, Fabrizio Luccio, Giovanni Manzini...
Addressed in this paper is the issue of 'email data cleaning' for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...
Column-oriented database systems [19, 23] perform better than traditional row-oriented database systems on analytical workloads such as those found in decision support and busines...