Search Sciweavers | Sciweavers

10 search results - page 1 / 2

» Near-optimal Regret Bounds for Reinforcement Learning

click to vote

ML
2002
ACM

121views Machine Learning» more ML 2002»

Near-Optimal Reinforcement Learning in Polynomial Time

13 years 4 months ago

Download www.cis.upenn.edu

We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...

Michael J. Kearns, Satinder P. Singh

claim paper

Read More »

click to vote

NIPS
2008

98views Information Technology» more NIPS 2008»

Near-optimal Regret Bounds for Reinforcement Learning

13 years 6 months ago

Download eprints.pascal-network.org

Peter Auer, Thomas Jaksch, Ronald Ortner

claim paper

Read More »

click to vote

COLT
2010
Springer

207views Machine Learning» more COLT 2010»

An Asymptotically Optimal Bandit Algorithm for Bounded Support Models

13 years 2 months ago

Download www.colt2010.org

Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...

Junya Honda, Akimichi Takemura

claim paper

Read More »

click to vote

CORR
2010
Springer

105views Education» more CORR 2010»

Optimism in Reinforcement Learning Based on Kullback-Leibler Divergence

13 years 3 months ago

Download hal.archives-ouvertes.fr

We consider model-based reinforcement learning in ﬁnite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...

Sarah Filippi, Olivier Cappé, Aurelien Gari...

claim paper

Read More »

click to vote

AAAI
2004

135views Intelligent Agents» more AAAI 2004»

Performance Bounded Reinforcement Learning in Strategic Interactions

13 years 6 months ago

Download www.aaai.org

Despite increasing deployment of agent technologies in several business and industry domains, user confidence in fully automated agent driven applications is noticeably lacking. T...

Bikramjit Banerjee, Jing Peng

claim paper

Read More »

« Prev « First page 1 / 2 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers