Sciweavers

651 search results - page 86 / 131
» Algorithms for Inverse Reinforcement Learning
Sort
View
AAAI
2006
14 years 11 months ago
Modeling Human Decision Making in Cliff-Edge Environments
In this paper we propose a model for human learning and decision making in environments of repeated Cliff-Edge (CE) interactions. In CE environments, which include common daily in...
Ron Katz, Sarit Kraus
ICML
2010
IEEE
14 years 7 months ago
Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...
Carlton Downey, Scott Sanner
AROBOTS
2002
115views more  AROBOTS 2002»
14 years 9 months ago
Statistical Learning for Humanoid Robots
The complexity of the kinematic and dynamic structure of humanoid robots make conventional analytical approaches to control increasingly unsuitable for such systems. Learning techn...
Sethu Vijayakumar, Aaron D'Souza, Tomohiro Shibata...

Publication
222views
15 years 6 months ago
Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration
Abstract: Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervis...
Christos Dimitrakakis, Michail G. Lagoudakis
COLT
2010
Springer
14 years 7 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura