policy iteration algorithm


CDC
2008
IEEE


Approximate dynamic programming using support vector regression

16 years 2 days ago

This paper presents a new approximate policy iteration algorithm based on support vector regression (SVR). It provides an overview of commonly used cost approximation architect...

Brett Bethke, Jonathan P. How, Asuman E. Ozdaglar


ICML
2009
IEEE


Model-free reinforcement learning as mixture learning

16 years 6 months ago


We cast model-free reinforcement learning as the problem of maximizing the likelihood of a probabilistic mixture model via sampling, addressing both the infinite and finite horizo...

Nikos Vlassis, Marc Toussaint
