Sciweavers


ICML
2006
IEEE


An intrinsic reward mechanism for efficient exploration


Download www-anw.cs.umass.edu

How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exploit later? We formulate this problem as a Markov Decision Process by explicitly modeling the internal state of the agent and propose a principled heuristic for its solution. We present experimental results in a number of domains, also exploring the algorithm's use for learning a policy for a skill given its reward function, an important but neglected component of skill discovery.

Özgür Simsek, Andrew G. Barto


ICML 2006 | Machine Learning | Markov Decision Process | Reinforcement Learning Agent | Skill Discovery



Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2006
Where	ICML
Authors	Özgür Simsek, Andrew G. Barto
