Search Sciweavers | Sciweavers

771 search results - page 9 / 155

» Markov Decision Processes with Arbitrary Reward Processes

179

click to vote

FOCS
2007
IEEE

157views Theoretical Computer Science» more FOCS 2007»

Approximation Algorithms for Partial-Information Based Stochastic Control with Markovian Rewards

16 years 1 days ago

Download www.cis.upenn.edu

We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...

Sudipto Guha, Kamesh Munagala

claim paper

Read More »

180

Voted

IJCAI
2007

254views Artificial Intelligence» more IJCAI 2007»

Bayesian Inverse Reinforcement Learning

15 years 7 months ago

Download www.ijcai.org

Inverse Reinforcement Learning (IRL) is the problem of learning the reward function underlying a Markov Decision Process given the dynamics of the system and the behaviour of an e...

Deepak Ramachandran, Eyal Amir

claim paper

Read More »

161

click to vote

IJCAI
2001

174views Artificial Intelligence» more IJCAI 2001»

Complexity of Probabilistic Planning under Average Rewards

15 years 7 months ago

Download www.informatik.uni-freiburg.de

A general and expressive model of sequential decision making under uncertainty is provided by the Markov decision processes (MDPs) framework. Complex applications with very large ...

Jussi Rintanen

claim paper

Read More »

159

click to vote

AAAI
2010

136views Intelligent Agents» more AAAI 2010»

Robust Policy Computation in Reward-Uncertain MDPs Using Nondominated Policies

15 years 7 months ago

Download www.cs.toronto.edu

The precise specification of reward functions for Markov decision processes (MDPs) is often extremely difficult, motivating research into both reward elicitation and the robust so...

Kevin Regan, Craig Boutilier

claim paper

Read More »

199

click to vote

JMLR
2010

189views more JMLR 2010»

Adaptive Step-size Policy Gradients with Average Reward Metric

15 years 15 days ago

Download jmlr.csail.mit.edu

In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...

Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...

claim paper

Read More »

« Prev « First page 9 / 155 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers