Sciweavers

» A gradient-based reinforcement learning approach to dynamic ...
UAI
2001
13 years 6 months ago
The Optimal Reward Baseline for Gradient-Based Reinforcement Learning
There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...
Lex Weaver, Nigel Tao
IROS
2009
IEEE
13 years 11 months ago
Bayesian reinforcement learning in continuous POMDPs with Gaussian processes
Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical model to handle real-world sequential decision processes but require a known model to be solv...
Patrick Dallaire, Camille Besse, Stéphane R...
WECWIS
2003
IEEE
13 years 9 months ago
Reinforcement Learning Applications in Dynamic Pricing of Retail Markets
In this paper, we investigate the application of reinforcement learning (RL) techniques to the problem of determining dynamic prices in an electronic retail market. As representative mode...
C. V. L. Raju, Y. Narahari, K. Ravikumar
CSE
2008
IEEE
13 years 11 months ago
Adaptation to Dynamic Resource Availability in Ad Hoc Grids through a Learning Mechanism
Ad-hoc Grids are highly heterogeneous and dynamic networks. One of the main challenges of resource allocation in such environments is to find mechanisms that do not rely on the ...
Behnaz Pourebrahimi, Koen Bertels