Search Sciweavers | Sciweavers

223 search results - page 9 / 45

» Least-Squares Temporal Difference Learning

ICCBR
2010
Springer

274 views | Automated Reasoning

Reducing the Memory Footprint of Temporal Difference Learning over Finitely Many States by Using Case-Based Generalization

15 years 10 months ago

Download www.cse.lehigh.edu

In this paper we present an approach for reducing the memory footprint requirement of temporal difference methods in which the set of states is finite. We use case-based generaliza...

Matt Dilts, Héctor Muñoz-Avila

AAAI
2007

142 views | Intelligent Agents

Temporal Difference and Policy Search Methods for Reinforcement Learning: An Empirical Comparison

15 years 8 months ago

Download staff.science.uva.nl

Reinforcement learning (RL) methods have become popular in recent years because of their ability to solve complex tasks with minimal feedback. Both genetic algorithms (GAs) and te...

Matthew E. Taylor, Shimon Whiteson, Peter Stone

ECAI
2000
Springer

90 views | Artificial Intelligence

Efficient Asymptotic Approximation in Temporal Difference Learning

15 years 9 months ago

Download www.inra.fr

Abstract. TD(

Frédérick Garcia, Florent Serre

NIPS
1993

123 views | Information Technology

Temporal Difference Learning of Position Evaluation in the Game of Go

15 years 7 months ago

Download www.gatsby.ucl.ac.uk

The game of Go has a high branching factor that defeats the tree search approach used in computer chess, and long-range spatiotemporal interactions that make position evaluation e...

Nicol N. Schraudolph, Peter Dayan, Terrence J. Sej...

ML
2002
ACM

168 views | Machine Learning

On Average Versus Discounted Reward Temporal-Difference Learning

15 years 5 months ago

Download web.mit.edu

We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...

John N. Tsitsiklis, Benjamin Van Roy
