Sciweavers

334 search results - page 43 / 67
» How to Dynamically Merge Markov Decision Processes
ATAL 2009 (Springer)
Improving adjustable autonomy strategies for time-critical domains
As agents begin to perform complex tasks alongside humans as collaborative teammates, it becomes crucial that the resulting human-multiagent teams adapt to time-critical domains. I...
Nathan Schurr, Janusz Marecki, Milind Tambe
ICML 2007 (IEEE)
Constructing basis functions from directed graphs for value function approximation
Basis functions derived from an undirected graph connecting nearby samples from a Markov decision process (MDP) have proven useful for approximating value functions. The success o...
Jeffrey Johns, Sridhar Mahadevan
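
This line of work builds on Laplacian (proto-value-function) bases: eigenvectors of a graph Laplacian over sampled states serve as features for value-function approximation, and the paper extends the construction to directed graphs. A minimal sketch of the undirected baseline follows; the toy chain graph and the choice of the normalized Laplacian are illustrative assumptions, not the paper's directed-graph construction.

```python
import numpy as np

def laplacian_basis(adjacency, k):
    """Return the k smoothest eigenvectors of the normalized graph Laplacian
    L = I - D^{-1/2} A D^{-1/2}; each column is one basis function over states."""
    A = np.asarray(adjacency, dtype=float)
    d = A.sum(axis=1)
    d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)    # eigenvalues come back in ascending order
    return eigvecs[:, :k]

# Hypothetical 4-state chain: nearby sampled states joined by edges.
A = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
Phi = laplacian_basis(A, k=2)         # |S| x k feature matrix
# A value function is then approximated as V ~ Phi @ w for learned weights w.
```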
ICML 2007 (IEEE)
Automatic shaping and decomposition of reward functions
This paper investigates the problem of automatically learning how to restructure the reward function of a Markov decision process so as to speed up reinforcement learning. We begi...
Bhaskara Marthi
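
For context, the classical safe form of such restructuring is potential-based shaping, where the shaped reward r' = r + γΦ(s') − Φ(s) provably preserves optimal policies; the paper's focus is constructing Φ (and reward decompositions) automatically rather than by hand. The sketch below uses a hypothetical hand-written potential purely to show the mechanics.

```python
GAMMA = 0.95

def phi(state):
    # Hypothetical potential: negative distance to an assumed goal state.
    goal = 10
    return -abs(goal - state)

def shaped_reward(reward, state, next_state):
    """r' = r + gamma * phi(s') - phi(s): adds dense guidance while
    preserving the optimal policy of the original MDP."""
    return reward + GAMMA * phi(next_state) - phi(state)

# Example: a zero native reward still yields an informative learning signal.
print(shaped_reward(0.0, state=3, next_state=4))   # 0.95*(-6) - (-7) = 1.3
```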
ICML 2008 (IEEE)
Apprenticeship learning using linear programming
In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in tha...
Umar Syed, Michael H. Bowling, Robert E. Schapire
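
Their formulation works with occupancy measures: a linear program searches for a state-action visitation distribution whose feature expectations dominate the expert's, subject to Bellman flow constraints. The sketch below solves a version of that LP on a hypothetical two-state MDP with made-up features and expert expectations; it follows the spirit of the paper's LP rather than its exact formulation.

```python
import numpy as np
from scipy.optimize import linprog

n_s, n_a, gamma = 2, 2, 0.9
P = np.zeros((n_s, n_a, n_s))          # P[s, a, s']: transition kernel
P[0, 0] = [0.9, 0.1]; P[0, 1] = [0.2, 0.8]
P[1, 0] = [0.8, 0.2]; P[1, 1] = [0.1, 0.9]
p0 = np.array([1.0, 0.0])              # initial-state distribution
F = np.array([[1.0, 0.0, 0.0, 1.0],    # F[i, (s,a)]: feature i per state-action
              [0.0, 1.0, 1.0, 0.0]])
mu_E = np.array([4.0, 3.0])            # expert's discounted feature expectations

# Variables z = [x(s,a) flattened, t]; maximize t subject to
#   F @ x >= mu_E + t   (learner matches or beats the expert on every feature)
#   Bellman flow constraints tying x to a valid stationary policy.
n_x = n_s * n_a
A_ub = np.hstack([-F, np.ones((F.shape[0], 1))])   # -F@x + t <= -mu_E
b_ub = -mu_E
A_eq = np.zeros((n_s, n_x + 1))
for s in range(n_s):
    for a in range(n_a):
        A_eq[s, s * n_a + a] += 1.0                        # outflow from s
        for s2 in range(n_s):
            A_eq[s2, s * n_a + a] -= gamma * P[s, a, s2]   # discounted inflow
b_eq = p0
c = np.zeros(n_x + 1); c[-1] = -1.0                # linprog minimizes, so -t
res = linprog(c, A_ub, b_ub, A_eq, b_eq,
              bounds=[(0, None)] * n_x + [(None, None)])
x = res.x[:n_x].reshape(n_s, n_a)
policy = x / x.sum(axis=1, keepdims=True)          # pi(a|s) from occupancy measure
```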
ICML 2001 (IEEE)
Continuous-Time Hierarchical Reinforcement Learning
Hierarchical reinforcement learning (RL) is a general framework which studies how to exploit the structure of actions and tasks to accelerate policy learning in large domains. Pri...
Mohammad Ghavamzadeh, Sridhar Mahadevan
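
Work in this area typically rests on the SMDP Q-learning backup, in which a temporally extended activity (option) runs for a real-valued duration tau and discounting becomes exp(-beta * tau) in continuous time. A minimal sketch of that update follows; the option names, rates, and reward values are hypothetical.

```python
import math
from collections import defaultdict

ALPHA, BETA = 0.1, 0.05        # learning rate, continuous-time discount rate
Q = defaultdict(float)          # Q[(state, option)]

def smdp_q_update(s, option, reward, tau, s_next, options):
    """One SMDP backup: `reward` is the return accumulated while `option`
    ran for `tau` time units before terminating in s_next."""
    discount = math.exp(-BETA * tau)
    best_next = max(Q[(s_next, o)] for o in options)
    target = reward + discount * best_next
    Q[(s, option)] += ALPHA * (target - Q[(s, option)])

# Example: option "navigate" ran 2.7 time units and collected reward 1.5.
smdp_q_update(s=0, option="navigate", reward=1.5, tau=2.7,
              s_next=1, options=["navigate", "pickup"])
```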