Sciweavers

ACMSE
2010
ACM
13 years 2 months ago
Generating three binary addition algorithms using reinforcement programming
Reinforcement Programming (RP) is a new technique for automatically generating a computer program using reinforcement learning methods. This paper describes how RP learned to gene...
Spencer K. White, Tony R. Martinez, George L. Rudo...
AR
2008
13 years 4 months ago
Efficient Behavior Learning Based on State Value Estimation of Self and Others
The existing reinforcement learning methods have been seriously suffering from the curse of dimension problem especially when they are applied to multiagent dynamic environments. ...
Yasutake Takahashi, Kentarou Noma, Minoru Asada
ICMLA
2003
13 years 5 months ago
A Distributed Reinforcement Learning Approach to Pattern Inference in Go
This paper shows that the distributed representation found in Learning Vector Quantization (LVQ) enables reinforcement learning methods to cope with a large decision search spa...
Myriam Abramson, Harry Wechsler
NCI
2004
13 years 5 months ago
Hierarchical reinforcement learning with subpolicies specializing for learned subgoals
This paper describes a method for hierarchical reinforcement learning in which high-level policies automatically discover subgoals, and low-level policies learn to specialize for ...
Bram Bakker, Jürgen Schmidhuber
ICDM
2002
IEEE
13 years 9 months ago
Empirical Comparison of Various Reinforcement Learning Strategies for Sequential Targeted Marketing
We empirically evaluate the performance of various reinforcement learning methods in applications to sequential targeted marketing. In particular, we propose and evaluate a progre...
Naoki Abe, Edwin P. D. Pednault, Haixun Wang, Bian...
IROS
2006
IEEE
13 years 10 months ago
Policy Gradient Methods for Robotics
The acquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-struc...
Jan Peters, Stefan Schaal
ICML
2001
IEEE
14 years 5 months ago
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
ICML
2004
IEEE
14 years 5 months ago
Learning to fly by combining reinforcement learning with behavioural cloning
Reinforcement learning deals with learning optimal or near optimal policies while interacting with the environment. Application domains with many continuous variables are difficul...
Eduardo F. Morales, Claude Sammut