Sciweavers

AAAI
2007

A Reinforcement Learning Algorithm with Polynomial Interaction Complexity for Only-Costly-Observable MDPs

Download www.aaai.org

An Unobservable MDP (UMDP) is a POMDP in which there are no observations. An Only-Costly-Observable MDP (OCOMDP) is a POMDP that extends a UMDP by adding one particular costly action that completely observes the state. We introduce UR-MAX, a reinforcement learning algorithm with polynomial interaction complexity for unknown OCOMDPs.
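
The two definitions above can be made concrete with a small simulator. What follows is a minimal Python sketch of a toy OCOMDP environment only; it is not UR-MAX and is not taken from the paper. The class name `ToyOCOMDP`, the action name `OBSERVE`, the state-based reward table, and the choice that observing leaves the state unchanged are all assumptions made for illustration.

```python
import random

OBSERVE = "observe"  # hypothetical name for the single costly observing action


class ToyOCOMDP:
    """Toy OCOMDP: the state stays hidden unless the agent pays to observe it."""

    def __init__(self, transitions, rewards, observe_cost, start_state, seed=0):
        self.transitions = transitions  # transitions[state][action] -> {next_state: probability}
        self.rewards = rewards          # rewards[state] -> reward on entering that state (assumption)
        self.observe_cost = observe_cost
        self.state = start_state
        self.total_reward = 0.0         # simulator-side bookkeeping, not shown to the agent
        self.rng = random.Random(seed)

    def step(self, action):
        """Apply an action and return the observation: the state for OBSERVE, else None."""
        if action == OBSERVE:
            # Assumption: observing does not move the state; it only costs reward.
            self.total_reward -= self.observe_cost
            return self.state
        dist = self.transitions[self.state][action]
        next_states = list(dist)
        weights = [dist[s] for s in next_states]
        self.state = self.rng.choices(next_states, weights=weights)[0]
        self.total_reward += self.rewards[self.state]
        return None  # ordinary actions yield no observation, as in a UMDP


# Two states that deterministically alternate under the single ordinary action "a".
transitions = {0: {"a": {1: 1.0}}, 1: {"a": {0: 1.0}}}
rewards = {0: 0.0, 1: 1.0}
env = ToyOCOMDP(transitions, rewards, observe_cost=0.5, start_state=0)

blind = env.step("a")     # no observation comes back
seen = env.step(OBSERVE)  # pays observe_cost and reveals the current state
```

Dropping `OBSERVE` from the action set turns this toy environment into a UMDP, which mirrors how the abstract describes an OCOMDP as an extension of a UMDP.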

Roy Fox, Moshe Tennenholtz

AAAI 2007 | Intelligent Agents | Only-Costly-Observable MDP | Particular Costly Action | Unobservable MDP

Post Info

Added	02 Oct 2010
Updated	02 Oct 2010
Type	Conference
Year	2007
Where	AAAI
Authors	Roy Fox, Moshe Tennenholtz