Sciweavers

1249 search results - page 202 / 250
» State Machine Modeling: From Synch States to Synchronized St...
Sort
View
ICML
2000
IEEE
16 years 18 days ago
Eligibility Traces for Off-Policy Policy Evaluation
Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...
Doina Precup, Richard S. Sutton, Satinder P. Singh
RV
2010
Springer
220views Hardware» more  RV 2010»
14 years 9 months ago
Runtime Verification with the RV System
The RV system is the first system to merge the benefits of Runtime Monitoring with Predictive Analysis. The Runtime Monitoring portion of RV is based on the successful Monitoring O...
Patrick O'Neil Meredith, Grigore Rosu
ICML
1999
IEEE
16 years 18 days ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
ICML
1994
IEEE
15 years 3 months ago
A Modular Q-Learning Architecture for Manipulator Task Decomposition
Compositional Q-Learning (CQ-L) (Singh 1992) is a modular approach to learning to performcomposite tasks made up of several elemental tasks by reinforcement learning. Skills acqui...
Chen K. Tham, Richard W. Prager
ICML
2003
IEEE
15 years 5 months ago
The Influence of Reward on the Speed of Reinforcement Learning: An Analysis of Shaping
Shaping can be an effective method for improving the learning rate in reinforcement systems. Previously, shaping has been heuristically motivated and implemented. We provide a for...
Adam Laud, Gerald DeJong