Search Sciweavers | Sciweavers

473 search results - page 15 / 95

» Optimal policy switching algorithms for reinforcement learni...

ICML
2006
IEEE

101 views, Machine Learning

Qualitative reinforcement learning

16 years 17 days ago

Download www.cs.uiuc.edu

When the transition probabilities and rewards of a Markov Decision Process are specified exactly, the problem can be solved without any interaction with the environment. When no s...

Arkady Epshteyn, Gerald DeJong

ATAL
2008
Springer

160 views, Intelligent Agents

Sequential decision making in repeated coalition formation under uncertainty

15 years 1 month ago

Download www.aamas-conference.org

The problem of coalition formation when agents are uncertain about the types or capabilities of their potential partners is a critical one. In [3] a Bayesian reinforcement learnin...

Georgios Chalkiadakis, Craig Boutilier

IJCAI
2003

169 views, Artificial Intelligence

Covariant Policy Search

15 years 1 month ago

Download www.ri.cmu.edu

We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geom...

J. Andrew Bagnell, Jeff G. Schneider

ATAL
2004
Springer

116 views, Intelligent Agents

Time-Extended Policies in Multi-Agent Reinforcement Learning

15 years 5 months ago

Download web.engr.oregonstate.edu

Many algorithms such as Q-learning successfully address reinforcement learning in single-agent multi-time-step problems. In addition there are methods that address reinforcement l...

Kagan Tumer, Adrian K. Agogino

ECML
2006
Springer

146 views, Machine Learning

Task-Driven Discretization of the Joint Space of Visual Percepts and Continuous Actions

15 years 3 months ago

Download www.montefiore.ulg.ac.be

We target the problem of closed-loop learning of control policies that map visual percepts to continuous actions. Our algorithm, called Reinforcement Learning of Joint Classes (RLJ...

Sébastien Jodogne, Justus H. Piater
