Sciweavers

350 search results - page 15 / 70
» Incremental profile learning based on a reinforcement method
CG
2006
Springer
14 years 11 months ago
Feature Construction for Reinforcement Learning in Hearts
Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-Gammon illustrated how the combination of game tree search...
Nathan R. Sturtevant, Adam M. White
NN
2010
Springer
14 years 8 months ago
Parameter-exploring policy gradients
We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in paramet...
Frank Sehnke, Christian Osendorfer, Thomas Rü...
IWLCS
2005
Springer
15 years 3 months ago
Counter Example for Q-Bucket-Brigade Under Prediction Problem
Aiming to clarify the convergence or divergence conditions for Learning Classifier System (LCS), this paper explores: (1) an extreme condition where the reinforcement process of ...
Atsushi Wada, Keiki Takadama, Katsunori Shimohara
NN
2002
Springer
14 years 9 months ago
Control of exploitation-exploration meta-parameter in reinforcement learning
In reinforcement learning (RL), the duality between exploitation and exploration has long been an important issue. This paper presents a new method that controls the balance betwe...
Shin Ishii, Wako Yoshida, Junichiro Yoshimoto
AIMSA
2006
Springer
15 years 1 month ago
Machine Learning for Spoken Dialogue Management: An Experiment with Speech-Based Database Querying
Although speech and language processing techniques achieved a relative maturity during the last decade, designing a spoken dialogue system is still a tailoring task because of the ...
Olivier Pietquin