Abstract. We present an information-theoretic framework for mining dependencies between itemsets in binary data. The problem of closure-based redundancy in this context is theoreti...
In this investigation, we propose a probabilistic approach for estimating the ages of Blog authors by means of Naive Bayesian Classifier. We can learn context of characteristic wor...
We introduce a robust and efficient framework called CLUMP (CLustering Using Multiple Prototypes) for unsupervised discovery of structure in data. CLUMP relies on finding multip...
We begin with pervasive ultrametricity due to high dimensionality and/or spatial sparsity. How extent or degree of ultrametricity can be quantified leads us to the discussion of va...
Abstract. Latent semantic indexing (LSI) is an application of numerical method called singular value decomposition (SVD), which discovers latent semantic in documents by creating c...