Search Sciweavers | Sciweavers

97 search results - page 9 / 20

Guiding Inference with Policy Search Reinforcement Learning

ACL
2009

123 views, Computational Linguistics

Reinforcement Learning for Mapping Instructions to Actions

14 years 9 months ago

Download www.aclweb.org

In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function tha...

S. R. K. Branavan, Harr Chen, Luke S. Zettlemoyer,...

ICML
2001
IEEE

159 views, Machine Learning

Direct Policy Search using Paired Statistical Tests

16 years 13 days ago

Download www.autonlab.org

Direct policy search is a practical way to solve reinforcement learning problems involving continuous state and action spaces. The goal becomes finding policy parameters that maxi...

Malcolm J. A. Strens, Andrew W. Moore

ICML
2006
IEEE

103 views, Machine Learning

Using inaccurate models in reinforcement learning

16 years 13 days ago

Download ai.stanford.edu

In the model-based policy search approach to reinforcement learning (RL), policies are found using a model (or "simulator") of the Markov decision process. However, for ...

Pieter Abbeel, Morgan Quigley, Andrew Y. Ng

ICML
2009
IEEE

131 views, Machine Learning

Monte-Carlo simulation balancing

16 years 13 days ago

Download www.cs.ualberta.ca

In this paper we introduce the first algorithms for efficiently learning a simulation policy for Monte-Carlo search. Our main idea is to optimise the balance of a simulation polic...

David Silver, Gerald Tesauro

ICML
2010
IEEE

282 views, Machine Learning

Bayesian Multi-Task Reinforcement Learning

15 years 21 days ago

Download hal.inria.fr

We consider the problem of multi-task reinforcement learning where the learner is provided with a set of tasks, for which only a small number of samples can be generated for any g...

Alessandro Lazaric, Mohammad Ghavamzadeh
