Efficiently solving many different optimal control tasks within the same underlying environment requires decomposing the environment into its computationally elemental ...
We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...
We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...
The efficient solution of large-scale linear and nonlinear optimization problems may require exploiting any special structure they possess. We describe and analy...
Antonio J. Conejo, Francisco J. Nogales, Francisco...
In this paper we address a general Goal Programming problem with linear objectives, convex constraints, and an arbitrary componentwise nondecreasing norm to aggregate deviations w...