Sciweavers

71 search results - page 1 / 15
» An Analysis of Direct Reinforcement Learning in Non-Markovia...
NIPS
2001
13 years 6 months ago
Reinforcement Learning with Long Short-Term Memory
This paper presents reinforcement learning with a Long Short-Term Memory recurrent neural network: RL-LSTM. Model-free RL-LSTM using Advantage learning and directed exploration can...
Bram Bakker
AGENTS
1999
Springer
13 years 9 months ago
Team-Partitioned, Opaque-Transition Reinforcement Learning
In this paper, we present a novel multi-agent learning paradigm called team-partitioned, opaque-transition reinforcement learning (TPOT-RL). TPOT-RL introduces the concept of usin...
Peter Stone, Manuela M. Veloso
JCP
2007
13 years 4 months ago
Noisy K Best-Paths for Approximate Dynamic Programming with Application to Portfolio Optimization
We describe a general method to transform a non-Markovian sequential decision problem into a supervised learning problem using a K-best-paths algorithm. We consider an a...
Nicolas Chapados, Yoshua Bengio
ICML
2008
IEEE
14 years 5 months ago
A worst-case comparison between temporal difference and residual gradient with linear function approximation
Residual gradient (RG) was proposed as an alternative to TD(0) for policy evaluation when function approximation is used, but there exists little formal analysis comparing them ex...
Lihong Li