Sciweavers

1950 search results - page 82 / 390
» Informative sampling for large unbalanced data sets
PAMI
2010
14 years 12 months ago
Local-Learning-Based Feature Selection for High-Dimensional Data Analysis
This paper considers feature selection for data classification in the presence of a huge number of irrelevant features. We propose a new feature selection algorithm that addres...
Yijun Sun, Sinisa Todorovic, Steve Goodison
SIGIR
2003
ACM
15 years 6 months ago
Using manually-built web directories for automatic evaluation of known-item retrieval
Information retrieval system evaluation is complicated by the need for manually assessed relevance judgments. Large manually-built directories on the web open the door to new eval...
Steven M. Beitzel, Eric C. Jensen, Abdur Chowdhury...
KDD
2006
ACM
16 years 1 month ago
Mining distance-based outliers from large databases in any metric space
Let R be a set of objects. An object o ∈ R is an outlier if there exist fewer than k objects in R whose distances to o are at most r. The values of k, r, and the distance metric ar...
Yufei Tao, Xiaokui Xiao, Shuigeng Zhou
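The outlier definition in the abstract above can be illustrated with a brute-force check. This is only a minimal sketch of the definition, not the algorithm the paper proposes: it compares every pair of objects, and the function name `distance_based_outliers` and the sample data are hypothetical. It also assumes that an object is not counted as its own neighbor, which the truncated abstract does not state.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def distance_based_outliers(
    objects: Sequence[T],
    k: int,
    r: float,
    dist: Callable[[T, T], float],
) -> List[T]:
    """Return every object with fewer than k other objects within distance r.

    Naive O(n^2) scan that works with any user-supplied distance function.
    """
    outliers = []
    for i, o in enumerate(objects):
        neighbors = 0
        for j, p in enumerate(objects):
            if i == j:
                continue  # assumption: o does not count as its own neighbor
            if dist(o, p) <= r:
                neighbors += 1
                if neighbors >= k:
                    break  # o already has enough neighbors; stop early
        if neighbors < k:
            outliers.append(o)
    return outliers


# Hypothetical one-dimensional data with absolute difference as the metric.
points = [0.0, 0.5, 1.0, 10.0]
result = distance_based_outliers(points, k=2, r=1.0, dist=lambda a, b: abs(a - b))
print(result)
```

With these sample values, only the point far from the cluster near zero has fewer than two neighbors within distance 1.0. Because `dist` is passed in as a callable, the same check works for any metric, which matches the abstract's remark that the distance metric is chosen by the user.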
BMCBI
2010
15 years 1 months ago
Data reduction for spectral clustering to analyze high throughput flow cytometry data
Background: Recent biological discoveries have shown that clustering large datasets is essential for better understanding biology in many areas. Spectral clustering in particular ...
Habil Zare, Parisa Shooshtari, Arvind Gupta, Ryan ...
RECOMB
2006
Springer
16 years 1 month ago
Efficient Enumeration of Phylogenetically Informative Substrings
We study the problem of enumerating substrings that are common amongst genomes that share evolutionary descent. For example, one might want to enumerate all identical (therefore co...
Stanislav Angelov, Boulos Harb, Sampath Kannan, Sa...