Sciweavers

771 search results - page 37 / 155
» Markov Decision Processes with Arbitrary Reward Processes
CORR
2006
Springer
15 years 1 month ago
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...
Manuel Loth, Philippe Preux
CSL
2012
Springer
13 years 9 months ago
Reinforcement learning for parameter estimation in statistical spoken dialogue systems
Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estim...
Filip Jurcícek, Blaise Thomson, Steve Young
QEST
2010
IEEE
14 years 11 months ago
Symblicit Calculation of Long-Run Averages for Concurrent Probabilistic Systems
Model checkers for concurrent probabilistic systems have become very popular within the last decade. The study of long-run average behavior has, however, received only scan...
Ralf Wimmer, Bettina Braitling, Bernd Becker, Erns...
FLAIRS
2004
15 years 3 months ago
State Space Reduction For Hierarchical Reinforcement Learning
This paper provides new techniques for abstracting the state space of a Markov Decision Process (MDP). These techniques extend one of the recent minimization models, known as ε-reduction, ...
Mehran Asadi, Manfred Huber
IPPS
2000
IEEE
15 years 6 months ago
A Decision-Process Analysis of Implicit Coscheduling
This paper presents a theoretical framework based on Bayesian decision theory for analyzing recently reported results on implicit coscheduling of parallel applications on clusters of work...
Radha Poovendran, Peter J. Keleher, John S. Baras