Sciweavers

1233 search results - page 173 / 247
» Reinforcement Learning in MirrorBot
ICML
2010
IEEE
14 years 11 months ago
Toward Off-Policy Learning Control with Function Approximation
We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
Hamid Reza Maei, Csaba Szepesvári, Shalabh ...
IIE
2007
14 years 9 months ago
Student-Centered Support Systems to Sustain Logo-Like Learning
Conventional wisdom attributes the lack of effective technology use in classrooms to a shortage of professional development or poorly run professional development. At the same time...
Sylvia Martinez
ICML
2006
IEEE
15 years 10 months ago
An intrinsic reward mechanism for efficient exploration
How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...
Özgür Simsek, Andrew G. Barto
ICRA
2009
IEEE
Robotics
15 years 4 months ago
Least absolute policy iteration for robust value function approximation
Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers...
Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...
ICRA
1994
IEEE
Robotics
15 years 1 months ago
Harmonic Functions and Collision Probabilities
There is a close relationship between harmonic functions, which have recently been proposed for path planning, and hitting probabilities for random processes. The hitting probab...
Christopher I. Connolly