The search for finite-state controllers for partially observable Markov decision processes (POMDPs) is often based on approaches like gradient ascent, attractive because of their ...
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(), LSTD()...
We study the stochastic machine replenishment problem, which is a canonical special case of closed multiclass queuing systems in Markov decision theory. The problem models the sche...
We consider the problem of energy-efficient point-to-point transmission of delay-sensitive data (e.g. multimedia data) over a fading channel. We propose a rigorous and unified fra...
Hierarchical reinforcement learning (RL) is a general framework which studies how to exploit the structure of actions and tasks to accelerate policy learning in large domains. Pri...