Search Sciweavers | Sciweavers

14 search results - page 2 / 3

» Near-Optimal Reinforcement Learning in Polynomial Time

JAIR
2011

Non-Deterministic Policies in Markovian Decision Processes

12 years 11 months ago

Download www.jair.org

Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making proble...

Mahdi Milani Fard, Joelle Pineau

ICML
2003
IEEE

Machine Learning

Exploration in Metric State Spaces

14 years 5 months ago

Download www.cis.upenn.edu

We present Metric-E³, a provably near-optimal algorithm for reinforcement learning in Markov decision processes in which there is a natural metric on the state space that allows t...

Sham Kakade, Michael J. Kearns, John Langford

COLT
1993
Springer

Machine Learning

Learning Binary Relations Using Weighted Majority Voting

13 years 8 months ago

Download www.soe.ucsc.edu

In this paper we demonstrate how weighted majority voting with multiplicative weight updating can be applied to obtain robust algorithms for learning binary relations. We first pre...

Sally A. Goldman, Manfred K. Warmuth

ICML
2009
IEEE

Machine Learning

Near-Bayesian exploration in polynomial time

14 years 5 months ago

Download ai.stanford.edu

We consider the exploration/exploitation problem in reinforcement learning (RL). The Bayesian approach to model-based RL offers an elegant solution to this problem, by considering...

J. Zico Kolter, Andrew Y. Ng

ICML
2006
IEEE

Machine Learning

An analytic solution to discrete Bayesian reinforcement learning

14 years 5 months ago

Download www.cs.uwaterloo.ca

Reinforcement learning (RL) was originally proposed as a framework to allow agents to learn in an online fashion as they interact with their environment. Existing RL algorithms co...

Pascal Poupart, Nikos A. Vlassis, Jesse Hoey, Kevi...
