ISDA
2009
IEEE

Postponed Updates for Temporal-Difference Reinforcement Learning

This paper presents postponed updates, a new strategy for TD methods that can improve sample efficiency without incurring the computational and space requirements of model-based RL. By recording its last-visit experience, the agent can delay the update for a state until that state is revisited, thereby improving the quality of the update. Experimental results demonstrate that postponed updates outperform several competitors, most notably eligibility traces, a traditional way to improve the sample efficiency of TD methods. They achieve this without the need to tune an extra parameter, as eligibility traces require.
Harm van Seijen, Shimon Whiteson
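
The idea in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' algorithm: the abstract does not specify the update rule, so the sketch assumes a Q-learning style target. The class `PostponedQLearner`, its methods, and the toy two-state example are all hypothetical names introduced here for illustration. Each state keeps only its most recent experience, and that experience is applied when the state is next visited, so the target can use whatever the value estimates have become in the meantime.

```python
from collections import defaultdict


class PostponedQLearner:
    """Illustrative sketch only: a Q-learning variant that defers each update.

    Assumption: the target is the usual one-step Q-learning target, computed
    at revisit time instead of at the time the transition was observed.
    """

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha
        self.gamma = gamma
        self.q = defaultdict(float)  # (state, action) -> value estimate
        self.pending = {}            # state -> last-visit experience

    def max_q(self, state):
        return max(self.q[(state, a)] for a in self.actions)

    def visit(self, state):
        """Call on arrival in `state`; applies its postponed update, if any."""
        exp = self.pending.pop(state, None)
        if exp is None:
            return
        action, reward, next_state, done = exp
        target = reward if done else reward + self.gamma * self.max_q(next_state)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    def record(self, state, action, reward, next_state, done=False):
        """Store the experience instead of updating immediately."""
        self.pending[state] = (action, reward, next_state, done)


# Toy trajectory: s0 -> s1 -> s1 (reward 1) -> s0
learner = PostponedQLearner(["stay", "go"], alpha=0.5, gamma=1.0)
learner.visit("s0")
learner.record("s0", "go", 0.0, "s1")
learner.visit("s1")
learner.record("s1", "stay", 1.0, "s1")
learner.visit("s1")   # s1 revisited: its stored update is applied now
learner.record("s1", "go", 0.0, "s0")
learner.visit("s0")   # s0 revisited: its update sees the improved value of s1
```

In this toy run, an immediate one-step update for s0 would have used a zero estimate for s1, whereas the deferred update uses the estimate learned after s0 was left. Experiences still pending when learning stops are never applied in this sketch; how that case should be handled is left open here.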
Added 24 May 2010
Updated 24 May 2010
Type Conference
Year 2009
Where ISDA