Often the best performing supervised learning models are ensembles of hundreds or thousands of base-level classifiers. Unfortunately, the space required to store this many classif...
Cristian Bucila, Rich Caruana, Alexandru Niculescu...
Background: Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. Howeve...
Paolo Ferragina, Raffaele Giancarlo, Valentina Gre...
This paper proposes the compression of data in Relational Database Management Systems (RDBMS) using existing text compression algorithms. Although the technique proposed is general...
Determining similarities among multimedia objects is a fundamental task in many content-based retrieval, analysis, mining, and exploration applications. Among state-of-the-art sim...
Given multiple time sequences with missing values, we propose DynaMMo which summarizes, compresses, and finds latent variables. The idea is to discover hidden variables and learn ...
Lei Li, James McCann, Nancy S. Pollard, Christos F...