Sciweavers

20 search results - page 1 / 4
» Sigma point policy iteration
ATAL
2008
Springer
15 years 3 months ago
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
AAAI
2007
15 years 3 months ago
Point-Based Policy Iteration
We describe a point-based policy iteration (PBPI) algorithm for infinite-horizon POMDPs. PBPI replaces the exact policy improvement step of Hansen’s policy iteration with point...
Shihao Ji, Ronald Parr, Hui Li, Xuejun Liao, Lawre...
NIPS
2003
15 years 2 months ago
Approximate Policy Iteration with a Policy Language Bias
We study an approach to policy selection for large relational Markov Decision Processes (MDPs). We consider a variant of approximate policy iteration (API) that replaces the usual...
Alan Fern, Sung Wook Yoon, Robert Givan
AAAI
2006
15 years 2 months ago
Incremental Least Squares Policy Iteration for POMDPs
We present a new algorithm, called incremental least squares policy iteration (ILSPI), for finding the infinite-horizon stationary policy for partially observable Markov decision ...
Hui Li, Xuejun Liao, Lawrence Carin
CORR
2008
Springer
15 years 1 months ago
Adaptive Sum Power Iterative Waterfilling for MIMO Cognitive Radio Channels
In this paper, the sum capacity of the Gaussian Multiple Input Multiple Output (MIMO) Cognitive Radio Channel (MCC) is expressed as a convex problem with finite number of...
Rajiv Soundararajan, Sriram Vishwanath