We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Simi...
—This paper presents a method for learning decision theoretic models of human behaviors from video data. Our system learns relationships between the movements of a person, the co...
A long-lived agent continually faces new tasks in its environment. Such an agent may be able to use knowledge learned in solving earlier tasks to produce candidate policies for it...
In environmental and natural resource planning domains actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...
The software suite we will demonstrate at AAAI '06 was designed around planning with factored Markov decision processes (MDPs). It is a user-friendly suite that facilitates d...
Krol Kevin Mathias, Casey Lengacher, Derek William...