
ICDM
2007
IEEE

Sample Selection for Maximal Diversity

The problem of selecting a sample subset sufficient to preserve diversity arises in many applications. One example is in the design of recombinant inbred lines (RIL) for genetic association studies. In this context, genetic diversity is measured by how many alleles are retained in the resulting inbred strains. RIL panels that are derived from more than two parental strains, such as the Collaborative Cross [2, 14], present a particular challenge with regard to which of the many existing lab mouse strains should be included in the initial breeding funnel in order to maximize allele retention. A similar problem occurs in the study of customer reviews when selecting a subset of products with a maximal diversity in reviews. Diversity in this case implies the presence of a set of products having both positive and negative ranks for each customer. In this paper, we demonstrate that selecting an optimal diversity subset is an NP-complete problem via reduction to set cover. This reduction is s...
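The connection to set cover can be made concrete with a small sketch. The code below is not the authors' method; it is the standard greedy heuristic for set cover, swapped in for illustration, under the assumption that each strain can be represented as the set of alleles it carries. The function name `greedy_diverse_subset`, the strain labels, and the allele identifiers are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_diverse_subset(alleles_by_strain: Dict[str, Set[str]], k: int) -> List[str]:
    """Pick up to k strains, each time taking the one that adds the most new alleles.

    Illustrative greedy set-cover heuristic (an assumption, not the paper's algorithm).
    """
    covered: Set[str] = set()      # alleles retained by the strains chosen so far
    chosen: List[str] = []
    remaining = dict(alleles_by_strain)

    for _ in range(k):
        if not remaining:
            break
        # Iterating in sorted order makes ties resolve alphabetically by strain name.
        best = max(sorted(remaining), key=lambda s: len(remaining[s] - covered))
        if not remaining[best] - covered:
            break                  # no remaining strain adds a new allele
        chosen.append(best)
        covered |= remaining.pop(best)

    return chosen


# Hypothetical panel: four strains, six alleles in total.
panel = {
    "A": {"a1", "a2", "a3"},
    "B": {"a3", "a4"},
    "C": {"a4", "a5", "a6"},
    "D": {"a1", "a6"},
}
print(greedy_diverse_subset(panel, 2))  # ['A', 'C']
```

In this toy panel, strains A and C together retain all six alleles, so adding B or D would contribute nothing further. The same sketch applies to the customer review setting if each product is mapped to the set of (customer, rank sign) pairs it contributes, which is again an assumed encoding rather than one stated in the abstract.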
Added 16 Aug 2010
Updated 16 Aug 2010
Type Conference
Year 2007
Where ICDM
Authors Feng Pan, Adam Roberts, Leonard McMillan, David Threadgill, Wei Wang 0010