In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...
Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...
We propose a model for level-ups in Heroes of Might and Magic III, and give an $O\left(\frac{1}{\varepsilon^2}\ln\frac{1}{\delta}\right)$ learning algorithm to estimate the probabilities of secondary skills induced by any ...
The goal of reinforcement learning (RL) is to maximize reward (or, equivalently, minimize cost) in a Markov decision process (MDP) without knowing the underlying model a priori. RL algorithms tend ...
A learning classifier system (LCS) is an effective tool for solving classification problems. Clustering with XCS (an accuracy-based LCS) is a novel, recently proposed approach. In this pape...
Recently, we have introduced a novel approach to dynamic programming and reinforcement learning that is based on maintaining explicit representations of stationary distributions i...
Tao Wang, Daniel J. Lizotte, Michael H. Bowling, D...