— Accurate and fast control of wheel speeds in the presence of noise and nonlinearities is one of the crucial requirements for building fast mobile robots, as they are required i...
In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function....
There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...
We use reinforcement learning (RL) to compute strategies for multiagent soccer teams. RL may pro t signi cantly from world models (WMs) estimating state transition probabilities an...
We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...