Sciweavers

85 search results - page 2 / 17
» Approximate Policy Iteration with a Policy Language Bias
Sort
View
ML
2008
ACM
152views Machine Learning» more  ML 2008»
13 years 4 months ago
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
Abstract. We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian Decision Problems. As opposed to previous theoretical wo...
András Antos, Csaba Szepesvári, R&ea...
AAAI
2007
13 years 7 months ago
Point-Based Policy Iteration
We describe a point-based policy iteration (PBPI) algorithm for infinite-horizon POMDPs. PBPI replaces the exact policy improvement step of Hansen’s policy iteration with point...
Shihao Ji, Ronald Parr, Hui Li, Xuejun Liao, Lawre...

Publication
334views
14 years 1 months ago
Rollout Sampling Approximate Policy Iteration
Several researchers have recently investigated the connection between reinforcement learning and classification. We are motivated by proposals of approximate policy iteration schem...
Christos Dimitrakakis, Michail G. Lagoudakis
NIPS
2003
13 years 6 months ago
Bounded Finite State Controllers
We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic fini...
Pascal Poupart, Craig Boutilier
NIPS
2001
13 years 6 months ago
Rates of Convergence of Performance Gradient Estimates Using Function Approximation and Bias in Reinforcement Learning
We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state action ...
Gregory Z. Grudic, Lyle H. Ungar