In this paper we develop a novel measure of information in a random variable based on its cumulative distribution that we dub cumulative residual entropy (CRE). This measure parall...
A rich family of generic Information Extraction (IE) techniques has been developed by researchers. This paper proposes WebKER, a system for automatically extract...
More and more applications rely heavily on large amounts of data in distributed storage, collected over time or produced by large-scale scientific experiments or simulations. ...
Due to the dynamic nature of online information, XML documents typically evolve over time. Changes to the data values or structures of an XML document may exhibit some...
Ling Chen 0002, Sourav S. Bhowmick, Liang-Tien Chi...
We present a practical approach to nonparametric cluster analysis of large data sets. The number of clusters and the cluster centres are automatically derived by mode seeking wit...