Sciweavers

» A time aggregation approach to Markov decision processes
ICML
2009
IEEE
16 years 2 months ago
Piecewise-stationary bandit problems with side observations
We consider a sequential decision problem where the rewards are generated by a piecewise-stationary distribution. However, the different reward distributions are unknown and may c...
Jia Yuan Yu, Shie Mannor
RSA
2000
15 years 1 months ago
Delayed path coupling and generating random permutations
We analyze various stochastic processes for generating permutations almost uniformly at random in distributed and parallel systems. All our protocols are simple, elegant and are b...
Artur Czumaj, Miroslaw Kutylowski
JMLR
2006
15 years 1 months ago
Policy Gradient in Continuous Time
Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...
Rémi Munos
ATAL
2006
Springer
15 years 5 months ago
Solving POMDPs using quadratically constrained linear programs
Developing scalable algorithms for solving partially observable Markov decision processes (POMDPs) is an important challenge. One promising approach is based on representing POMDP...
Christopher Amato, Daniel S. Bernstein, Shlomo Zil...
AAAI
2008
15 years 4 months ago
Another Look at Search-Based Drama Management
A drama manager (DM) monitors an interactive experience, such as a computer game, and intervenes to shape the global experience so it satisfies the author's expressive goals ...
Mark J. Nelson, Michael Mateas