Sciweavers

682 search results - page 137 / 137
» One-Counter Markov Decision Processes
Sort
View
NECO
2007
150views more  NECO 2007»
13 years 4 months ago
Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule
Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is ...
Dorit Baras, Ron Meir
ICML
1996
IEEE
14 years 6 months ago
Learning Evaluation Functions for Large Acyclic Domains
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...
Justin A. Boyan, Andrew W. Moore