Sciweavers

CORR
2012
Springer

Fractional Moments on Bandit Problems

Reinforcement learning addresses the dilemma between exploration to find profitable actions and exploitation to act according to the best observations already made. Bandit problems are a class of problems in stateless environments that capture this explore/exploit trade-off. We propose a learning algorithm for bandit problems based on fractional expectation of rewards acquired. The algorithm is theoretically shown to converge on an ε-optimal arm and achieve O(n) sample complexity. Experimental results show the algorithm incurs substantially lower regret than parameter-optimized ε-greedy and SoftMax approaches and other low sample complexity state-of-the-art techniques.
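For readers unfamiliar with the baselines named in the abstract, the following is a minimal sketch of ε-greedy action selection and SoftMax (Boltzmann) action probabilities on a Bernoulli bandit. It is not the fractional-moment algorithm proposed in the paper, which this listing does not describe in detail. The function names, the arm means, and the use of expected (pseudo-)regret are assumptions chosen for illustration.

```python
import math
import random


def epsilon_greedy(means, steps, epsilon, rng):
    """Run epsilon-greedy on a Bernoulli bandit (hypothetical helper).

    means   -- true success probability of each arm (known only to the simulator)
    steps   -- number of pulls
    epsilon -- probability of exploring a uniformly random arm
    rng     -- a random.Random instance, passed in for reproducibility

    Returns (pull counts per arm, cumulative expected regret).
    """
    k = len(means)
    counts = [0] * k
    estimates = [0.0] * k  # running sample mean of each arm's reward
    best = max(means)
    regret = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)  # explore
        else:
            arm = max(range(k), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the sample mean.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        # Expected regret of this pull relative to the best arm.
        regret += best - means[arm]
    return counts, regret


def softmax_probs(estimates, temperature):
    """Boltzmann action probabilities from reward estimates (hypothetical helper)."""
    m = max(estimates)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in estimates]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    counts, regret = epsilon_greedy([0.2, 0.5, 0.8], 1000, 0.1, random.Random(0))
    print(counts, regret)
    print(softmax_probs([0.1, 0.5, 0.9], 0.2))
```

Both baselines have a tunable parameter (ε and the temperature, respectively), which is what "parameter-optimized" refers to in the comparison above.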

Ananda Narayanan B., Balaraman Ravindran

Art Techniques | Bandit Problems | CORR 2012 | Education | Reinforcement Learning |

Post Info

Added	20 Apr 2012
Updated	20 Apr 2012
Type	Journal
Year	2012
Where	CORR
Authors	Ananda Narayanan B., Balaraman Ravindran
