ATAL
2010
Springer

Closing the learning-planning loop with predictive state representations

A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must learn an accurate model of our environment, and then plan to maximize reward. Unfortunately, learning algorithms often recover a model that is too inaccurate to support planning or too large and complex for planning to be feasible; or they require large amounts of prior domain knowledge or fail to provide important guarantees such as statistical consistency. To begin to fill this gap, we propose a novel algorithm which provably learns a compact, accurate model directly from sequences of action-observation pairs. To evaluate the learner, we then close the loop from observations to actions: we plan in the learned model and recover a policy which is near-optimal in the original environment (not the model). In more detail, we present a spectral algorithm for learning a Predictive State Representation (PSR). We demonstrate the algorithm by...
Byron Boots, Sajid M. Siddiqi, Geoffrey J. Gordon
Added 08 Nov 2010
Updated 08 Nov 2010
Type Conference
Year 2010
Where ATAL