Bandits for Taxonomies: A Model-based Approach

We consider a novel problem of learning an optimal matching, in an online fashion, between two feature spaces that are organized as taxonomies. We formulate this as a multi-armed bandit problem where the arms of the bandit are dependent due to the structure induced by the taxonomies. We then propose a multi-stage hierarchical allocation scheme that improves the explore/exploit properties of the classical multi-armed bandit policies in this scenario. In particular, our scheme uses the taxonomy structure and performs shrinkage estimation in a Bayesian framework to exploit dependencies among the arms, thereby enhancing exploration without losing efficiency on short-term exploitation. We prove that our scheme asymptotically converges to the optimal matching. We conduct extensive experiments on real data to illustrate the efficacy of our scheme in practice.
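The idea of a multi-stage hierarchical allocation can be illustrated with a small sketch: first select a taxonomy node, then an arm within it, with each node's posterior pooling the rewards of its children. This is only a hedged toy illustration, not the paper's actual algorithm; the taxonomy, the payoff probabilities, and the use of two-stage Thompson sampling with Beta posteriors are all assumptions made for the example.

```python
import random

random.seed(0)

# Hypothetical two-level taxonomy: category -> {arm: unknown Bernoulli
# payoff probability} (e.g., an ad click-through rate). Illustrative only.
TAXONOMY = {
    "sports":  {"sports/golf": 0.05, "sports/tennis": 0.12},
    "finance": {"finance/loans": 0.30, "finance/insurance": 0.25},
}

# Beta(1, 1) priors for every node; each category's posterior pools the
# rewards of its child arms -- a crude stand-in for Bayesian shrinkage.
arm_stats = {a: [1, 1] for arms in TAXONOMY.values() for a in arms}
cat_stats = {c: [1, 1] for c in TAXONOMY}

def pull(arm):
    """Simulate a Bernoulli reward for the chosen arm."""
    cat = arm.split("/")[0]
    return 1 if random.random() < TAXONOMY[cat][arm] else 0

def choose():
    """Two-stage Thompson sampling: sample a category, then an arm in it."""
    cat = max(cat_stats, key=lambda c: random.betavariate(*cat_stats[c]))
    return max(TAXONOMY[cat], key=lambda a: random.betavariate(*arm_stats[a]))

for _ in range(5000):
    arm = choose()
    r = pull(arm)
    cat = arm.split("/")[0]
    arm_stats[arm][0] += r
    arm_stats[arm][1] += 1 - r
    cat_stats[cat][0] += r          # pooled category-level posterior
    cat_stats[cat][1] += 1 - r

# The arm with the highest empirical mean after exploration.
best = max(arm_stats, key=lambda a: arm_stats[a][0] / sum(arm_stats[a]))
print(best)
```

Because the category-level posterior aggregates evidence from all child arms, a promising region of the taxonomy is explored sooner than independent per-arm bandits would allow, which mirrors the dependence structure the abstract describes.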
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2007
Where SDM
Authors Sandeep Pandey, Deepak Agarwal, Deepayan Chakrabarti, Vanja Josifovski