Search Sciweavers | Sciweavers

771 search results - page 8 / 155

» Markov Decision Processes with Arbitrary Reward Processes

ICML
2007
IEEE

162 views, Machine Learning

Automatic shaping and decomposition of reward functions

16 years 2 months ago

Download www.machinelearning.org

This paper investigates the problem of automatically learning how to restructure the reward function of a Markov decision process so as to speed up reinforcement learning. We begi...

Bhaskara Marthi

ICML
2001
IEEE

172 views, Machine Learning

Continuous-Time Hierarchical Reinforcement Learning

16 years 2 months ago

Download www.cs.ualberta.ca

Hierarchical reinforcement learning (RL) is a general framework which studies how to exploit the structure of actions and tasks to accelerate policy learning in large domains. Pri...

Mohammad Ghavamzadeh, Sridhar Mahadevan

AAAI
2007

102 views, Intelligent Agents

Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games

15 years 4 months ago

Download www.cs.cmu.edu

In timed, zero-sum games, the goal is to maximize the probability of winning, which is not necessarily the same as maximizing our expected reward. We consider cumulative intermedi...

Colin McMillen, Manuela M. Veloso

AIPS
2006

130 views, Artificial Intelligence

Probabilistic Planning with Nonlinear Utility Functions

15 years 3 months ago

Download www.aaai.org

Researchers often express probabilistic planning problems as Markov decision process models and then maximize the expected total reward. However, it is often rational to maximize ...

Yaxin Liu, Sven Koenig

ICML
2004
IEEE

214 views, Machine Learning

Apprenticeship learning via inverse reinforcement learning

16 years 2 months ago

Download ai.stanford.edu

We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we wa...

Pieter Abbeel, Andrew Y. Ng
