Search Sciweavers | Sciweavers

23 search results - page 4 / 5

» Learning from Reinforcement and Advice Using Composite Rewar...

222

click to vote

ICML
2000
IEEE

153views Machine Learning» more ICML 2000»

Eligibility Traces for Off-Policy Policy Evaluation

16 years 8 months ago

Download www.cs.ualberta.ca

Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...

Doina Precup, Richard S. Sutton, Satinder P. Singh

claim paper

Read More »

224

click to vote

JMLR
2010

119views more JMLR 2010»

A Convergent Online Single Time Scale Actor Critic Algorithm

15 years 2 months ago

Download jmlr.csail.mit.edu

Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...

Dotan Di Castro, Ron Meir

claim paper

Read More »

228

click to vote

ICML
1996
IEEE

162views Machine Learning» more ICML 1996»

Learning Evaluation Functions for Large Acyclic Domains

16 years 8 months ago

Download www.ri.cmu.edu

Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...

Justin A. Boyan, Andrew W. Moore

claim paper

Read More »

217

click to vote

IJCAI
2007

201views Artificial Intelligence» more IJCAI 2007»

Using Linear Programming for Bayesian Exploration in Markov Decision Processes

15 years 8 months ago

Download www.cs.mcgill.ca

A key problem in reinforcement learning is ﬁnding a good balance between the need to explore the environment and the need to gain rewards by exploiting existing knowledge. Much ...

Pablo Samuel Castro, Doina Precup

claim paper

Read More »

227

click to vote

ICML
1999
IEEE

168views Machine Learning» more ICML 1999»

Least-Squares Temporal Difference Learning

16 years 8 months ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan

claim paper

Read More »

« Prev « First page 4 / 5 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers