Kernel-Based Reinforcement Learning on Representative States

8 years 3 months ago
Kernel-Based Reinforcement Learning on Representative States
Markov decision processes (MDPs) are an established framework for solving sequential decision-making problems under uncertainty. In this work, we propose a new method for batchmode reinforcement learning (RL) with continuous state variables. The method is an approximation to kernel-based RL on a set of k representative states. Similarly to kernel-based RL, our solution is a fixed point of a kernelized Bellman operator and can approximate the optimal solution to an arbitrary level of granularity. Unlike kernel-based RL, our method is fast. In particular, our policies can be computed in O(n) time, where n is the number of training examples. The time complexity of kernel-based RL is Ω(n2 ). We introduce our method, analyze its convergence, and compare it to existing work. The method is evaluated on two existing control problems with 2 to 4 continuous variables and a new problem with 64 variables. In all cases, we outperform state-of-the-art results and offer simpler solutions.
Branislav Kveton, Georgios Theocharous
Added 29 Sep 2012
Updated 29 Sep 2012
Type Journal
Year 2012
Where AAAI
Authors Branislav Kveton, Georgios Theocharous
Comments (0)