Search Sciweavers | Sciweavers

651 search results - page 86 / 131

» Algorithms for Inverse Reinforcement Learning

click to vote

AAAI
2006

127views Intelligent Agents» more AAAI 2006»

Modeling Human Decision Making in Cliff-Edge Environments

15 years 1 months ago

Download www.aaai.org

In this paper we propose a model for human learning and decision making in environments of repeated Cliff-Edge (CE) interactions. In CE environments, which include common daily in...

Ron Katz, Sarit Kraus

claim paper

Read More »

106

click to vote

ICML
2010
IEEE

222views Machine Learning» more ICML 2010»

Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda

14 years 9 months ago

Download www.icml2010.org

Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...

Carlton Downey, Scott Sanner

claim paper

Read More »

126

click to vote

AROBOTS
2002

115views more AROBOTS 2002»

Statistical Learning for Humanoid Robots

14 years 11 months ago

Download www-clmc.usc.edu

The complexity of the kinematic and dynamic structure of humanoid robots make conventional analytical approaches to control increasingly unsuitable for such systems. Learning techn...

Sethu Vijayakumar, Aaron D'Souza, Tomohiro Shibata...

claim paper

Read More »

106

click to vote

Publication

222views

Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration

15 years 8 months ago

Download arxiv.org

Abstract: Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervis...

Christos Dimitrakakis, Michail G. Lagoudakis

posted by olethros

Read More »

108

click to vote

COLT
2010
Springer

207views Machine Learning» more COLT 2010»

An Asymptotically Optimal Bandit Algorithm for Bounded Support Models

14 years 10 months ago

Download www.colt2010.org

Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...

Junya Honda, Akimichi Takemura

claim paper

Read More »

« Prev « First page 86 / 131 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers