We consider the problem of identifying a team of skilled individuals for collaboration, in the presence of a social network. Each node in the input social network may be an expert...
Clustering problems often involve datasets where only a part of the data is relevant to the problem, e.g., in microarray data analysis only a subset of the genes show cohesive exp...
In this paper, we show that diagnostic classes in cancer gene expression data sets, which most often include thousands of features (genes), may be effectively separated with simple ...
Gregor Leban, Minca Mramor, Ivan Bratko, Blaz Zupa...
Background: The biological research literature is a major repository of knowledge. As the amount of literature increases, it will get harder to find the information of interest on...
Alexander S. Yeh, Alexander A. Morgan, Marc E. Col...
Deduplication is a key operation in integrating data from multiple sources. The main challenge in this task is designing a function that can resolve when a pair of records refer t...