— Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical model to handle realworld sequential decision processes but require a known model to be solv...
Approximate dynamic programming has been used successfully in a large variety of domains, but it relies on a small set of provided approximation features to calculate solutions re...
Marek Petrik, Gavin Taylor, Ronald Parr, Shlomo Zi...
—We propose a dynamic spectrum access scheme where secondary users recommend “good” channels to each other and access accordingly. We formulate the problem as an average rewa...
This paper presents a direct reinforcement learning algorithm, called Finite-Element Reinforcement Learning, in the continuous case, i.e. continuous state-space and time. The eval...
This paper treats the solution of nonlinear optimization problems involving discrete decision variables, also known as generalized disjunctive programming (GDP) or mixed-integer n...