Sciweavers

771 search results for "Markov Decision Processes with Arbitrary Reward Processes" (page 8 of 155)

ICML 2007 (IEEE)
Automatic shaping and decomposition of reward functions
This paper investigates the problem of automatically learning how to restructure the reward function of a Markov decision process so as to speed up reinforcement learning. We begi...
Bhaskara Marthi
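For context on what "restructuring the reward function" means here, a minimal sketch of potential-based reward shaping (Ng, Harada & Russell, 1999), the classical scheme this line of work builds on; it adds a bonus F(s, s') = γΦ(s') − Φ(s) that leaves the optimal policy unchanged. The chain MDP and potential function below are invented for illustration, not taken from the paper.

```python
# Illustrative sketch, not the paper's method: potential-based reward shaping.
# The 5-state chain and the potential phi are toy assumptions.

GAMMA = 0.9

def shaped_reward(r, s, s_next, potential, gamma=GAMMA):
    """Augment the raw reward r with F(s, s') = gamma*phi(s') - phi(s)."""
    return r + gamma * potential(s_next) - potential(s)

# Toy potential: negative distance to the goal state 4 on a 5-state chain.
phi = lambda s: -abs(4 - s)

# Moving toward the goal earns a positive shaping bonus,
# moving away earns a negative one.
assert shaped_reward(0.0, 1, 2, phi) > 0.0
assert shaped_reward(0.0, 2, 1, phi) < 0.0
```

Because the bonus telescopes along any trajectory, shaping of this form speeds up learning without changing which policy is optimal.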

ICML 2001 (IEEE)
Continuous-Time Hierarchical Reinforcement Learning
Hierarchical reinforcement learning (RL) is a general framework which studies how to exploit the structure of actions and tasks to accelerate policy learning in large domains. Pri...
Mohammad Ghavamzadeh, Sridhar Mahadevan

AAAI 2007
Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games
In timed, zero-sum games, the goal is to maximize the probability of winning, which is not necessarily the same as maximizing our expected reward. We consider cumulative intermedi...
Colin McMillen, Manuela M. Veloso
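The abstract's key point, that maximizing the probability of winning is not the same as maximizing expected reward, is easy to see with a toy simulation. The two strategies and their payoffs below are invented numbers, not from the paper: a "risky" play has the higher expected final margin, while a "safe" play has the higher probability of a positive margin (a win).

```python
# Toy illustration (invented numbers): expected margin vs win probability.
import random

def simulate(strategy, n=100_000, seed=0):
    """Estimate expected final margin and P(margin > 0) by Monte Carlo."""
    rng = random.Random(seed)
    margins = [strategy(rng) for _ in range(n)]
    return sum(margins) / n, sum(m > 0 for m in margins) / n

safe = lambda rng: 1.0                                  # always finish +1
risky = lambda rng: 10.0 if rng.random() < 0.6 else -10.0  # +10 w.p. 0.6

e_safe, p_safe = simulate(safe)
e_risky, p_risky = simulate(risky)

assert e_risky > e_safe   # risky maximizes expected margin (~2 vs 1)...
assert p_safe > p_risky   # ...but safe maximizes P(win) (1.0 vs ~0.6)
```

This is exactly the gap a thresholded-reward formulation targets: the objective depends on whether the cumulative reward clears a threshold, not on its expectation.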

AIPS 2006
Probabilistic Planning with Nonlinear Utility Functions
Researchers often express probabilistic planning problems as Markov decision process models and then maximize the expected total reward. However, it is often rational to maximize ...
Yaxin Liu, Sven Koenig

ICML 2004 (IEEE)
Apprenticeship learning via inverse reinforcement learning
We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we wa...
Pieter Abbeel, Andrew Y. Ng
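A central quantity in Abbeel and Ng's approach is the expert's discounted feature expectations, estimated from demonstrations, which the learner then tries to match under a reward assumed linear in the features. The sketch below computes that empirical estimate; the trajectories, state encoding, and feature map are toy assumptions for illustration.

```python
# Sketch of empirical discounted feature expectations mu_E, the statistic
# matched in apprenticeship learning. Demo data and features are invented.

GAMMA = 0.9

def feature_expectations(trajectories, phi, gamma=GAMMA):
    """Estimate mu = E[ sum_t gamma^t * phi(s_t) ] over demo trajectories."""
    dims = len(phi(trajectories[0][0]))
    mu = [0.0] * dims
    for traj in trajectories:
        for t, s in enumerate(traj):
            f = phi(s)
            for i in range(dims):
                mu[i] += (gamma ** t) * f[i]
    return [m / len(trajectories) for m in mu]

# Toy setup: states are ints 0..3, features = (reached goal 3?, position/10).
phi = lambda s: (1.0 if s == 3 else 0.0, s / 10.0)
demos = [[0, 1, 2, 3], [0, 2, 3, 3]]

mu_E = feature_expectations(demos, phi)
```

With a reward of the form R(s) = w · φ(s), any policy whose feature expectations are close to mu_E achieves expected reward close to the expert's, for every w of bounded norm, which is why matching this statistic suffices without ever recovering the true reward.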