One of the critical issues in search engines is the size of search indexes: as the number of documents handled by an engine increases, the search must preserve its efficiency,...
Abstract. Digital numbers D are the world’s most popular data representation: nearly all texts, sounds and images are coded somewhere in time and space by binary sequences. The m...
Background: DNA microarray technology is an innovative methodology in experimental molecular biology, which has produced huge amounts of valuable data in the profile of gene expre...
Background: Cluster analysis is an important technique for the exploratory analysis of biological data. Such data is often high-dimensional and inherently noisy, and contains outliers...
Benjamin Georgi, Ivan Gesteira Costa, Alexander Sc...
The similarity join has become an important database primitive to support similarity search and data mining. A similarity join combines two sets of complex objects such that the r...