Planning in partially-observable dynamical systems is a challenging problem, and recent developments in point-based techniques such as Perseus significantly improve performance as...
We propose a method of approximate dynamic programming for Markov decision processes (MDPs) using algebraic decision diagrams (ADDs). We produce near-optimal value functions and p...
One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...
Abstract. Trained support vector machines (SVMs) have a slow runtime classification speed if the classification problem is noisy and the sample data set is large. Approximating the...
In this paper, we consider the problem of planning and learning in the infinite-horizon discounted-reward Markov decision problems. We propose a novel iterative direct policysearc...