Sciweavers

38 search results - page 7 / 8
» On the Convergence of Optimistic Policy Iteration
CORR
2010
Springer
13 years 5 months ago
Global Optimization for Value Function Approximation
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bili...
Marek Petrik, Shlomo Zilberstein
INFOCOM
1995
IEEE
13 years 9 months ago
Complexity of Gradient Projection Method for Optimal Routing in Data Networks
In this paper, we derive a time-complexity bound for the gradient projection method for optimal routing in data networks. This result shows that the gradient projection algorith...
Wei Kang Tsai, John K. Antonio, Garng M. Huang
AIPS
2011
12 years 9 months ago
Heuristic Search for Generalized Stochastic Shortest Path MDPs
Research in efficient methods for solving infinite-horizon MDPs has so far concentrated primarily on discounted MDPs and the more general stochastic shortest path problems (SSPs...
Andrey Kolobov, Mausam, Daniel S. Weld, Hector Gef...
IPPS
2002
IEEE
13 years 10 months ago
Optimal Remapping in Dynamic Bulk Synchronous Computations via a Stochastic Control Approach
A bulk synchronous computation proceeds in phases that are separated by barrier synchronization. For dynamic bulk synchronous computations that exhibit varying phase-wise computat...
Gang George Yin, Cheng-Zhong Xu, Le Yi Wang
UAI
2004
13 years 7 months ago
Discretized Approximations for POMDP with Average Cost
In this paper, we propose a new lower approximation scheme for POMDP with discounted and average cost criterion. The approximating functions are determined by their values at a fi...
Huizhen Yu, Dimitri P. Bertsekas