ECML
2006
Springer

Scaling Model-Based Average-Reward Reinforcement Learning for Product Delivery

Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in state and action spaces, and high stochasticity. We present approaches that mitigate each of these curses. To handle the state-space explosion, we introduce "tabular linear functions" that generalize tile-coding and linear value functions. Action space complexity is reduced by replacing complete joint action space search with a form of hill climbing. To deal with high stochasticity, we introduce a new algorithm called ASH-learning, which is an afterstate version of H-Learning. Our extensions make it practical to apply reinforcement learning to a domain of product delivery - an optimization problem that combines inventory control and vehicle routing.
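The idea of replacing a complete joint action space search with hill climbing can be illustrated with a minimal sketch. This is not the procedure from the paper: the function name `hill_climb_joint_action`, the `choices` and `score` parameters, and the toy scoring function are all hypothetical, and the sketch assumes one discrete action per agent and some estimate of how good a joint action is.

```python
from typing import Callable, List, Sequence, Tuple


def hill_climb_joint_action(
    choices: Sequence[Sequence[int]],
    score: Callable[[Tuple[int, ...]], float],
) -> Tuple[int, ...]:
    """Coordinate-wise hill climbing over a joint action (illustrative only).

    choices[i] lists the actions available to agent i (hypothetical setup).
    score(joint) returns an estimated value of a joint action; higher is better.
    """
    # Start from an arbitrary joint action: the first option for every agent.
    joint: List[int] = [opts[0] for opts in choices]
    best = score(tuple(joint))

    improved = True
    while improved:
        improved = False
        # Change one agent's action at a time, holding the others fixed.
        for i, opts in enumerate(choices):
            for action in opts:
                if action == joint[i]:
                    continue
                candidate = joint[:i] + [action] + joint[i + 1:]
                value = score(tuple(candidate))
                # Accept only strict improvements, so the loop must terminate.
                if value > best:
                    joint, best = candidate, value
                    improved = True
    return tuple(joint)


# Toy example: three agents with four actions each, and a made-up score
# that peaks when each agent picks its own target action.
targets = (2, 0, 3)
toy_score = lambda joint: -sum((a - t) ** 2 for a, t in zip(joint, targets))
result = hill_climb_joint_action([range(4)] * 3, toy_score)
```

Each sweep evaluates on the order of the sum of the agents' action counts rather than their product, which is the source of the savings. The trade-off is that hill climbing can stop at a local optimum when the score does not decompose as cleanly as in this toy example.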
Scott Proper, Prasad Tadepalli
Added 22 Aug 2010
Updated 22 Aug 2010
Type Conference
Year 2006
Where ECML
Authors Scott Proper, Prasad Tadepalli