Search Sciweavers | Sciweavers

170 search results - page 4 / 34

» Learning to play Tetris applying reinforcement learning meth...

SARA
2005
Springer

102 views, Artificial Intelligence

Feature-Discovering Approximate Value Iteration Methods

15 years 5 months ago

Download cobweb.ecn.purdue.edu

Sets of features in Markov decision processes can play a critical role in approximately representing value and in abstracting the state space. Selection of features is crucial to the succe...

Jia-Hong Wu, Robert Givan

ICML
2003
IEEE

150 views, Machine Learning

The Significance of Temporal-Difference Learning in Self-Play Training: TD-Rummy versus EVO-rummy

15 years 5 months ago

Download www.hpl.hp.com

Reinforcement learning has been used for training game playing agents. The value function for a complex game must be approximated with a continuous function because the number of ...

Clifford Kotnik, Jugal K. Kalita

ROBOCUP
2007
Springer

167 views, Robotics

Cooperative/Competitive Behavior Acquisition Based on State Value Estimation of Others

15 years 6 months ago

Download www.er.ams.eng.osaka-u.ac.jp

The existing reinforcement learning approaches have been suffering from the curse of dimension problem when they are applied to multiagent dynamic environments. One of the typical...

Kentarou Noma, Yasutake Takahashi, Minoru Asada

AR
2008

118 views

Efficient Behavior Learning Based on State Value Estimation of Self and Others

15 years 14 days ago

Download www.er.ams.eng.osaka-u.ac.jp

The existing reinforcement learning methods have been seriously suffering from the curse of dimension problem especially when they are applied to multiagent dynamic environments. ...

Yasutake Takahashi, Kentarou Noma, Minoru Asada

ML
1998
ACM

136 views, Machine Learning

Co-Evolution in the Successful Learning of Backgammon Strategy

15 years 13 hours ago

Download www.demo.cs.brandeis.edu

Following Tesauro’s work on TD-Gammon, we used a 4000 parameter feed-forward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of t...

Jordan B. Pollack, Alan D. Blair
