Sciweavers

ORL
2007
13 years 3 months ago
Linear dependence of stationary distributions in ergodic Markov decision processes
In ergodic MDPs we consider stationary distributions of policies that coincide in all but n states, in which one of two possible actions is chosen. We give conditions and formulas...
Ronald Ortner
CORR
2010
Springer
13 years 3 months ago
Mixing Time and Stationary Expected Social Welfare of Logit Dynamics
We study logit dynamics [3] for strategic games. At every stage of the game a player is selected uniformly at random and she is assumed to play according to a noisy best-response ...
Vincenzo Auletta, Diodato Ferraioli, Francesco Pas...
ICML
2010
IEEE
13 years 4 months ago
Finite-Sample Analysis of LSTD
In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...
Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...
WSC
1998
13 years 5 months ago
Stopping Criterion for a Simulation-Based Optimization Method
We consider a new simulation-based optimization method called the Nested Partitions (NP) method. This method generates a Markov chain and solving the optimization problem is equiv...
Sigurdur Ólafsson, Leyuan Shi
DAGSTUHL
2006
13 years 5 months ago
How fast does the stationary distribution of the Markov chain modelling EAs concentrate on the homogeneous populations for small
One of the main difficulties faced when analyzing Markov chains modelling evolutionary algorithms is that their cardinality grows quite fast. A reasonable way to deal with this iss...
Boris Mitavskiy, Jonathan E. Rowe