A Monte-Carlo AIXI Approximation

This paper describes a computationally feasible approximation to the AIXI agent, a universal reinforcement learning agent for arbitrary environments. AIXI is scaled down in two key ways. First, the class of environment models is restricted to all prediction suffix trees of a fixed maximum depth; this allows a Bayesian mixture of environment models to be computed in time proportional to the logarithm of the size of the model class. Second, the finite-horizon expectimax search is approximated by an asymptotically convergent Monte Carlo Tree Search technique. This scaled-down AIXI agent is empirically shown to be effective on a wide class of toy problem domains, ranging from simple fully observable games to small POMDPs. We explore the limits of this approximate agent and propose a general heuristic framework for scaling this technique to much larger problems.
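The "logarithmic in the size of the model class" claim follows from the Context Tree Weighting idea: a mixture over all bounded-depth suffix trees can be maintained by updating only the one root-to-leaf path selected by the current context, so each update costs O(depth) while the class contains doubly-exponentially many trees. The sketch below is illustrative only and is not the paper's implementation; the class and method names are invented for the example.

```python
import math

class CTWNode:
    def __init__(self):
        self.counts = [0, 0]   # zeros and ones observed at this node
        self.log_pe = 0.0      # log probability under the KT estimator
        self.log_pw = 0.0      # log weighted (mixture) probability
        self.children = {}     # context bit -> child node, created lazily

class CTW:
    """Bayesian mixture over all binary suffix trees of depth <= `depth`,
    maintained with per-symbol cost O(depth) (Context Tree Weighting)."""

    def __init__(self, depth):
        self.depth = depth
        self.root = CTWNode()

    def update(self, bit, context):
        """Incorporate `bit`, observed after `context` (a list of past bits,
        most recent last, with at least `depth` entries)."""
        # Walk down the path selected by the context, root to leaf.
        path = [self.root]
        node = self.root
        for i in range(self.depth):
            node = node.children.setdefault(context[-1 - i], CTWNode())
            path.append(node)
        # Update estimates from the leaf back up to the root.
        for node in reversed(path):
            a, b = node.counts
            # Krichevsky-Trofimov estimator: P(bit) = (count + 1/2) / (a + b + 1)
            node.log_pe += math.log((node.counts[bit] + 0.5) / (a + b + 1))
            node.counts[bit] += 1
            if not node.children:
                node.log_pw = node.log_pe
            else:
                # Unvisited children contribute log probability 0 (prob. 1).
                log_kids = sum(ch.log_pw for ch in node.children.values())
                # Pw = 1/2 * Pe + 1/2 * prod(children Pw), via log-sum-exp.
                m = max(node.log_pe, log_kids)
                node.log_pw = m + math.log(
                    0.5 * math.exp(node.log_pe - m)
                    + 0.5 * math.exp(log_kids - m))

# Usage: a strictly alternating bit sequence is captured by a depth-1 suffix
# tree, so the mixture assigns it far more probability than a uniform coin.
model = CTW(depth=3)
history = [0, 0, 0]                  # padding so the first context is full
for b in [0, 1] * 20:
    model.update(b, history)
    history.append(b)
print(model.root.log_pw)             # log block probability of the sequence
```

The root's `log_pw` is the log probability the whole mixture assigns to the observed sequence; a predictive probability for the next bit can be obtained as the ratio of block probabilities before and after a trial update.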
Added 14 May 2011
Updated 14 May 2011
Type Journal
Year 2011
Where JAIR
Authors Joel Veness, Kee Siong Ng, Marcus Hutter, William T. B. Uther, David Silver