Sciweavers

81 search results - page 16 / 17
» The Optimal Reward Baseline for Gradient-Based Reinforcement...
IJRR
2008
13 years 5 months ago
Learning to Control in Operational Space
One of the most general frameworks for phrasing control problems for complex, redundant robots is operational space control. However, while this framework is of essential importan...
Jan Peters, Stefan Schaal
ICML
1999
IEEE
14 years 6 months ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
ICML
1996
IEEE
14 years 6 months ago
Learning Evaluation Functions for Large Acyclic Domains
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...
Justin A. Boyan, Andrew W. Moore
CIMCA
2008
IEEE
14 years 8 days ago
Tree Exploration for Bayesian RL Exploration
Research in reinforcement learning has produced algorithms for optimal decision making under uncertainty that fall within two main types. The first employs a Bayesian framework, ...
Christos Dimitrakakis
NIPS
2008
13 years 7 months ago
Goal-directed decision making in prefrontal cortex: a computational framework
Research in animal learning and behavioral neuroscience has distinguished between two forms of action control: a habit-based form, which relies on stored action values, and a goal...
Matthew Botvinick, James An