Search Sciweavers | Sciweavers

» Reinforcement learning

AAAI
2006

Intelligent Agents

Modeling Human Decision Making in Cliff-Edge Environments

15 years 7 months ago

Download www.aaai.org

In this paper we propose a model for human learning and decision making in environments of repeated Cliff-Edge (CE) interactions. In CE environments, which include common daily in...

Ron Katz, Sarit Kraus

ICML
2000
IEEE

Machine Learning

Eligibility Traces for Off-Policy Policy Evaluation

16 years 7 months ago

Download www.cs.ualberta.ca

Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...

Doina Precup, Richard S. Sutton, Satinder P. Singh

SMC
2007
IEEE

Control Systems

An improved immune Q-learning algorithm

16 years 11 days ago

Download web2.uwindsor.ca

Reinforcement learning is a framework in which an agent can learn behavior through exploration and exploitation, without knowledge of a task or an environment. Striking a balance betw...

Zhengqiao Ji, Q. M. Jonathan Wu, Maher A. Sid-Ahme...

IROS
2006
IEEE

Robotics

Policy Gradient Methods for Robotics

16 years 4 days ago

Download www.cs.utah.edu

The acquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-struc...

Jan Peters, Stefan Schaal

ECML
2005
Springer

Machine Learning

Natural Actor-Critic

15 years 11 months ago

Download www-clmc.usc.edu

This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari...

Jan Peters, Sethu Vijayakumar, Stefan Schaal
