In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...
Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...
This paper presents an algorithm for robust optimal control of regular languages under specified uncertainty bounds on the event cost parameters of the language measure that has b...
— This paper explores optimization of paging and registration policies in cellular networks. Motion is modeled as a discrete-time Markov process, and minimization of the discount...
Abstract. Tikhonov regularization often is applied with a finite difference regularization operator that approximates a low-order derivative. This paper proposes the use of ortho...
We describe a point-based policy iteration (PBPI) algorithm for infinite-horizon POMDPs. PBPI replaces the exact policy improvement step of Hansen’s policy iteration with point...
Shihao Ji, Ronald Parr, Hui Li, Xuejun Liao, Lawre...