Sciweavers

55 search results - page 5 / 11 for "Approximate Policy Iteration using Large-Margin Classifiers"
CDC 2010 (IEEE)
Pathologies of temporal difference methods in approximate dynamic programming
Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...
Dimitri P. Bertsekas
ICML 2006 (IEEE)
Fast direct policy evaluation using multiscale analysis of Markov diffusion processes
Policy evaluation is a critical step in the approximate solution of large Markov decision processes (MDPs), typically requiring O(|S|³) to directly solve the Bellman system of |S...
Mauro Maggioni, Sridhar Mahadevan
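For context, the direct solve the abstract refers to can be sketched in a few lines: policy evaluation reduces to the Bellman linear system (I - γP)v = r, and solving it densely costs O(|S|³). The transition matrix, rewards, and discount below are illustrative values, not from the paper, whose multiscale method is a faster alternative to this baseline.

```python
import numpy as np

# Direct policy evaluation: solve (I - gamma * P) v = r exactly.
# This is the O(|S|^3) dense solve; P, r, gamma are made-up examples.
gamma = 0.9
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.6, 0.2],
              [0.0, 0.3, 0.7]])   # transition probabilities under a fixed policy
r = np.array([1.0, 0.0, 2.0])     # expected one-step rewards per state
v = np.linalg.solve(np.eye(3) - gamma * P, r)   # exact state values
```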
PKDD 2009 (Springer)
Hybrid Least-Squares Algorithms for Approximate Policy Evaluation
The goal of approximate policy evaluation is to “best” represent a target value function according to a specific criterion. Temporal difference methods and Bellman residual m...
Jeffrey Johns, Marek Petrik, Sridhar Mahadevan
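The two criteria the abstract contrasts can be written as a small sketch (not the paper's hybrid algorithm): with linear features Φ and A = Φ - γPΦ, the Bellman-residual solution minimizes ‖Aw - r‖², while the TD fixed-point solution solves Φᵀ(Aw - r) = 0. The chain, rewards, and features below are made up for illustration.

```python
import numpy as np

# Two least-squares criteria for approximate policy evaluation with
# linear features Phi (illustrative values, not from the paper):
#   Bellman residual: minimize ||A w - r||^2
#   TD fixed point:   solve Phi^T (A w - r) = 0
gamma = 0.9
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.6, 0.2],
              [0.0, 0.3, 0.7]])       # transitions under a fixed policy
r = np.array([1.0, 0.0, 2.0])         # one-step rewards
Phi = np.array([[1.0, 0.0],
                [1.0, 0.0],
                [0.0, 1.0]])          # 2 features aggregating the 3 states
A = Phi - gamma * P @ Phi
w_br = np.linalg.lstsq(A, r, rcond=None)[0]       # Bellman residual weights
w_fp = np.linalg.solve(Phi.T @ A, Phi.T @ r)      # TD fixed-point weights
```

The two solutions differ whenever the features cannot represent the exact value function; a hybrid criterion interpolates between them.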
TIT 2010 (IEEE)
On resource allocation in fading multiple-access channels - an efficient approximate projection approach
We consider the problem of rate and power allocation in a multiple-access channel. Our objective is to obtain rate and power allocation policies that maximize a general concave ut...
Ali ParandehGheibi, Atilla Eryilmaz, Asuman E. Ozd...
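The general shape of the problem, maximizing a concave utility of the rates subject to a capacity constraint via projected gradient steps, can be sketched as follows. This is a simplified stand-in, not the paper's algorithm: the constraint here is a plain sum-capacity simplex, whereas the paper's contribution is an efficient approximate projection onto the more complex MAC capacity region. Utility, capacity, and step size are made up.

```python
import numpy as np

def project(x, C):
    # Euclidean projection onto {x >= 0, sum(x) <= C}.
    x = np.maximum(x, 0.0)
    if x.sum() <= C:
        return x
    u = np.sort(x)[::-1]                     # project onto the simplex sum(x) = C
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(x) + 1) > (css - C))[0][-1]
    theta = (css[rho] - C) / (rho + 1)
    return np.maximum(x - theta, 0.0)

C, x = 3.0, np.zeros(3)
for _ in range(500):
    grad = 1.0 / (1.0 + x)                   # gradient of sum(log(1 + x_i))
    x = project(x + 0.1 * grad, C)           # projected gradient ascent step
# With a symmetric utility, the optimum splits capacity equally: x_i = C / n.
```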
PKDD 2010 (Springer)
Efficient Planning in Large POMDPs through Policy Graph Based Factorized Approximations
Partially observable Markov decision processes (POMDPs) are widely used for planning under uncertainty. In many applications, the huge size of the POMDP state space makes straightf...
Joni Pajarinen, Jaakko Peltonen, Ari Hottinen, Mik...
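The core operation whose cost grows with the state space, the POMDP belief update b'(s') ∝ O(o|s') Σₛ T(s'|s,a) b(s), can be shown in a minimal two-state example. The transition matrix, observation likelihoods, and initial belief below are illustrative, not taken from the paper.

```python
import numpy as np

# Minimal POMDP belief update for one (action, observation) step.
# All numbers are made-up examples.
T = np.array([[0.7, 0.3],
              [0.4, 0.6]])          # T[s, s'] under the chosen action a
O = np.array([0.9, 0.2])            # O[s'] = Pr(observation o | s')
b = np.array([0.5, 0.5])            # current belief over the 2 states

b_pred = b @ T                      # predict: sum_s b(s) T(s, s')
b_new = O * b_pred                  # weight by observation likelihood
b_new /= b_new.sum()                # normalize to a distribution
```

Each update touches every state pair, which is why straightforward belief-space planning becomes infeasible for huge state spaces.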