Abstract. Approximate Policy Iteration (API) is a reinforcement learning paradigm that is able to solve high-dimensional, continuous control problems. We propose to exploit API for...
—The concept of cooperative retransmission in wireless networks has attracted considerable research attention. The basic idea is that when a receiver cannot decode a frame, the r...
We study online learning in an oblivious changing environment. The standard measure of regret bounds the difference between the cost of the online learner and the best decision in...
We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal ...
A general game player is an agent capable of taking as input a description of a game’s rules in a formal language and proceeding to play without any subsequent human input. To do...