Sciweavers

11 search results - page 2 / 3
» Suppressing intersample behavior in Iterative Learning Contr...
CDC
2010
IEEE
Control Systems
13 years 9 days ago
Pathologies of temporal difference methods in approximate dynamic programming
Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...
Dimitri P. Bertsekas
TON
2008
13 years 5 months ago
Stochastic learning solution for distributed discrete power control game in wireless data networks
Distributed power control is an important issue in wireless networks. Recently, noncooperative game theory has been applied to investigate interesting solutions to this problem. Th...
Yiping Xing, Rajarathnam Chandramouli
ICML
1996
IEEE
14 years 6 months ago
Learning Evaluation Functions for Large Acyclic Domains
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...
Justin A. Boyan, Andrew W. Moore
IROS
2007
IEEE
Robotics
13 years 11 months ago
Bipedal walking on rough terrain using manifold control
This paper presents an algorithm for adapting periodic behavior to gradual shifts in task parameters. Since learning optimal control in high dimensional domains is subject to t...
Tom Erez, William D. Smart
NIPS
1996
13 years 6 months ago
Multidimensional Triangulation and Interpolation for Reinforcement Learning
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
Scott Davies