Hill-climbing has been shown to be more effective than exhaustive search in solving satisfiability problems.Also, it has been used either by itself or in combination with other ...
Abstract. We consider Reinforcement Learning for average reward zerosum stochastic games. We present and analyze two algorithms. The first is based on relative Q-learning and the ...
POMDPs and their decentralized multiagent counterparts, DEC-POMDPs, offer a rich framework for sequential decision making under uncertainty. Their computational complexity, howeve...
Christopher Amato, Daniel S. Bernstein, Shlomo Zil...
In this paper, we study online algorithms when the input is not chosen adversarially, but consists of draws from some given probability distribution. While this model has been stu...
Abstract. We consider two-person zero-sum stochastic mean payoff games with perfect information, or BWR-games, given by a digraph G = (V = VB VW VR, E), with local rewards r : E R...
Endre Boros, Khaled M. Elbassioni, Vladimir Gurvic...