Sciweavers

Search results for "Reinforcement learning": 1233 results, page 198 of 247
ICML
2009
IEEE
16 years 4 months ago
Monte-Carlo simulation balancing
In this paper we introduce the first algorithms for efficiently learning a simulation policy for Monte-Carlo search. Our main idea is to optimise the balance of a simulation polic...
David Silver, Gerald Tesauro
ICML
2001
IEEE
16 years 4 months ago
Direct Policy Search using Paired Statistical Tests
Direct policy search is a practical way to solve reinforcement learning problems involving continuous state and action spaces. The goal becomes finding policy parameters that maxi...
Malcolm J. A. Strens, Andrew W. Moore
ECML
2007
Springer
15 years 10 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
NIPS
2008
15 years 5 months ago
Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms
Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...
John W. Roberts, Russ Tedrake
IIE
2007
15 years 3 months ago
Student-Centered Support Systems to Sustain Logo-Like Learning
Conventional wisdom attributes the lack of effective technology use in classrooms to a shortage of professional development or poorly run professional development. At the same time...
Sylvia Martinez