We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...
In this paper, the problem of stabilization of unknown nonlinear dynamical systems is considered. An adaptive feedback law is constructed that is based on the switching adaptiv...
Closed-loop, asymptotically stable walking motions are designed for a 5-link, planar bipedal robot model with one degree of underactuation. Parameter optimization is applied to...
In this paper, we consider the problem of planning and learning in infinite-horizon discounted-reward Markov decision problems. We propose a novel iterative direct policysearc...
To appear in: G. Tesauro, D. S. Touretzky and T. K. Leen, eds., Advances in Neural Information Processing Systems 7, MIT Press, Cambridge MA, 1995. A straightforward approach to t...