Search Sciweavers | Sciweavers

166 search results - page 19 / 34

» Online model learning in adversarial Markov decision process...

click to vote

JMLR
2010

125views more JMLR 2010»

Variational methods for Reinforcement Learning

14 years 6 months ago

Download jmlr.csail.mit.edu

We consider reinforcement learning as solving a Markov decision process with unknown transition distribution. Based on interaction with the environment, an estimate of the transit...

Thomas Furmston, David Barber

claim paper

Read More »

154

click to vote

CSL
2012
Springer

311views Automated Reasoning» more CSL 2012»

Reinforcement learning for parameter estimation in statistical spoken dialogue systems

13 years 7 months ago

Download mi.eng.cam.ac.uk

Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estim...

Filip Jurcícek, Blaise Thomson, Steve Young

claim paper

Read More »

click to vote

AAAI
2006

108views Intelligent Agents» more AAAI 2006»

Using Homomorphisms to Transfer Options across Continuous Reinforcement Learning Domains

15 years 25 days ago

Download www.eecs.umich.edu

We examine the problem of Transfer in Reinforcement Learning and present a method to utilize knowledge acquired in one Markov Decision Process (MDP) to bootstrap learning in a mor...

Vishal Soni, Satinder P. Singh

claim paper

Read More »

124

click to vote

JMLR
2010

189views more JMLR 2010»

Adaptive Step-size Policy Gradients with Average Reward Metric

14 years 6 months ago

Download jmlr.csail.mit.edu

In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...

Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...

claim paper

Read More »

click to vote

ECAI
2008
Springer

114views Artificial Intelligence» more ECAI 2008»

A hybrid approach to multi-agent decision-making

15 years 16 days ago

Download www.deetc.isel.ipl.pt

Abstract. In the aftermath of a large-scale disaster, agents’ decisions derive from self-interested (e.g. survival), common-good (e.g. victims’ rescue) and teamwork (e.g. ﬁre...

Paulo Trigo, Helder Coelho

claim paper

Read More »

« Prev « First page 19 / 34 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers