Search Sciweavers | Sciweavers

81 search results - page 3 / 17

» The Optimal Reward Baseline for Gradient-Based Reinforcement...

CSL
2012
Springer

311 views, Automated Reasoning

Reinforcement learning for parameter estimation in statistical spoken dialogue systems

12 years 1 month ago

Download mi.eng.cam.ac.uk

Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estim...

Filip Jurčíček, Blaise Thomson, Steve Young

COLT
2004
Springer

99 views, Machine Learning

Reinforcement Learning for Average Reward Zero-Sum Games

13 years 11 months ago

Download www.ece.mcgill.ca

Abstract. We consider Reinforcement Learning for average reward zero-sum stochastic games. We present and analyze two algorithms. The first is based on relative Q-learning and the ...

Shie Mannor

AI
1998
Springer

177 views, Artificial Intelligence

Model-Based Average Reward Reinforcement Learning

13 years 5 months ago

Download web.engr.oregonstate.edu

Reinforcement Learning (RL) is the study of programs that improve their performance by receiving rewards and punishments from the environment. Most RL methods optimize the discoun...

Prasad Tadepalli, DoKyeong Ok

CORR
2006
Springer

140 views, Education

Nearly optimal exploration-exploitation decision thresholds

13 years 5 months ago

Download www.idiap.ch

While in general trading off exploration and exploitation in reinforcement learning is hard, under some formulations relatively simple solutions exist. Optimal decision thresholds ...

Christos Dimitrakakis

ICML
2006
IEEE

142 views, Machine Learning

An intrinsic reward mechanism for efficient exploration

14 years 6 months ago

Download www-anw.cs.umass.edu

How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...

Özgür Şimşek, Andrew G. Barto
