Search Sciweavers | Sciweavers

473 search results - page 52 / 95

» Optimal policy switching algorithms for reinforcement learni...

click to vote

ECAI
2006
Springer

245views Artificial Intelligence» more ECAI 2006»

Least Squares SVM for Least Squares TD Learning

15 years 3 months ago

Download homepages.feis.herts.ac.uk

Abstract. We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...

Tobias Jung, Daniel Polani

claim paper

Read More »

click to vote

ESANN
2007

148views Neural Networks» more ESANN 2007»

Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning

15 years 1 months ago

Download www.dice.ucl.ac.be

In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists out of actor updates which are achieved using natur...

Jan Peters, Stefan Schaal

claim paper

Read More »

click to vote

ECML
2006
Springer

116views Machine Learning» more ECML 2006»

Scaling Model-Based Average-Reward Reinforcement Learning for Product Delivery

15 years 3 months ago

Download web.engr.oregonstate.edu

Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in state and action spaces, and high stochasticity. We present approaches that ...

Scott Proper, Prasad Tadepalli

claim paper

Read More »

click to vote

ML
2002
ACM

133views Machine Learning» more ML 2002»

Finite-time Analysis of the Multiarmed Bandit Problem

14 years 11 months ago

Download homes.dsi.unimi.it

Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while t...

Peter Auer, Nicolò Cesa-Bianchi, Paul Fisch...

claim paper

Read More »

100

click to vote

CORR
2007
Springer

73views Education» more CORR 2007»

Universal Reinforcement Learning

14 years 11 months ago

Download www.stanford.edu

—We consider an agent interacting with an unmodeled environment. At each time, the agent makes an observation, takes an action, and incurs a cost. Its actions can inﬂuence futu...

Vivek F. Farias, Ciamac Cyrus Moallemi, Tsachy Wei...

claim paper

Read More »

« Prev « First page 52 / 95 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers