A key issue in supervised protein classification is the representation of input sequences of amino acids. Recent work using string kernels for protein data has achieved state-of-t...
Jason Weston, Christina S. Leslie, Eugene Ie, Deng...
In this demo we present the cgmOLAP server, the first fully functional parallel OLAP system able to build data cubes at a rate of more than 1 terabyte per hour. cgmOLAP incorporat...
Ying Chen, Andrew Rau-Chaplin, Frank K. H. A. Dehn...
In situations where class labels are known for some of the objects, a cluster analysis that respects this information, i.e., semi-supervised clustering, can give insight into the cl...
Clustering can be defined as a data assignment problem where the goal is to partition the data into nonhierarchical groups of items. In our previous work, we suggested an informati...
This paper is concerned with efficient querying of very large multi-resolution datasets on storage and compute clusters. We present a suite of services that support storage, index...