Sciweavers

2108 search results - page 242 / 422
» Tracking in Reinforcement Learning
ICML
2001
IEEE
16 years 1 month ago
Direct Policy Search using Paired Statistical Tests
Direct policy search is a practical way to solve reinforcement learning problems involving continuous state and action spaces. The goal becomes finding policy parameters that maxi...
Malcolm J. A. Strens, Andrew W. Moore
ECML
2007
Springer
15 years 6 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
ACSE
2000
ACM
15 years 4 months ago
The information environments program - a new design based IT degree
The University of Queensland has recently established a new design-focused, studio-based IT degree at a new “flexible-learning” campus. The Bachelor of Information Environment...
Michael Docherty, Peter Sutton, Margot Brereton, S...
ICCS
1993
Springer
15 years 4 months ago
Towards Domain-Independent Machine Intelligence
Adaptive predictive search (APS) is a learning system framework which, given little initial domain knowledge, increases its decision-making abilities in complex problem domains....
Robert Levinson
NIPS
2008
15 years 2 months ago
Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms
Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...
John W. Roberts, Russ Tedrake