We present a novel POMDP planning algorithm called heuristic search value iteration (HSVI). HSVI is an anytime algorithm that returns a policy and a provable bound on its regret w...
: In this paper we obtain new effective results on the Halpern iterations of nonexpansive mappings using methods from mathematical logic or, more specifically, proof-theoretic te...
—This paper presents an iterative receiver for Multiple-Input Multiple-Output (MIMO) Orthogonal Frequency Division Multiplexing (OFDM) systems. The receiver performs channel esti...
We present an iterative Markov chain Monte Carlo algorithm for computing reference priors and minimax risk for general parametric families. Our approach uses MCMC techniques based...
In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...
Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...