Search Sciweavers | Sciweavers

162 search results - page 17 / 33

» Off-Policy Temporal Difference Learning with Function Approx...

ESANN
2008

Learning to play Tetris applying reinforcement learning methods

Download www.dice.ucl.ac.be

In this paper, the application of reinforcement learning to Tetris is investigated; in particular, temporal difference learning is applied to estimate the state value funct...

Alexander Groß, Jan Friedland, Friedhelm Sch...

CG
2006
Springer

Feature Construction for Reinforcement Learning in Hearts

Download webdocs.cs.ualberta.ca

Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-gammon illustrated how the combination of game tree search...

Nathan R. Sturtevant, Adam M. White

CORR
2006
Springer

Decision Making with Side Information and Unbounded Loss Functions

Download www.hpl.hp.com

We consider the problem of decision-making with side information and unbounded loss functions. Inspired by the probably approximately correct (PAC) learning model, we use a slightly differe...

Majid Fozunbal, Ton Kalker

AAAI
2008

Strategyproof Classification under Constant Hypotheses: A Tale of Two Functions

Download www.aaai.org

We consider the following setting: a decision maker must make a decision based on reported data points with binary labels. Subsets of data points are controlled by different selfi...

Reshef Meir, Ariel D. Procaccia, Jeffrey S. Rosens...

NIPS
1998

Gradient Descent for General Reinforcement Learning

Download www.ri.cmu.edu

A simple learning rule, the VAPS algorithm, is derived, which can be instantiated to generate a wide range of new reinforcement-learning algorithms. These algorithms solve a number ...

Leemon C. Baird III, Andrew W. Moore
