Search Sciweavers | Sciweavers

38 search results - page 1 / 8

» On the Convergence of Optimistic Policy Iteration

click to vote

JMLR
2002

100views more JMLR 2002»

On the Convergence of Optimistic Policy Iteration

13 years 4 months ago

Download www.mit.edu

We consider a finite-state Markov decision problem and establish the convergence of a special case of optimistic policy iteration that involves Monte Carlo estimation of Q-values,...

John N. Tsitsiklis

claim paper

Read More »

click to vote

ICML
2009
IEEE

172views Machine Learning» more ICML 2009»

Model-free reinforcement learning as mixture learning

14 years 5 months ago

Download user.cs.tu-berlin.de

We cast model-free reinforcement learning as the problem of maximizing the likelihood of a probabilistic mixture model via sampling, addressing both the infinite and finite horizo...

Nikos Vlassis, Marc Toussaint

claim paper

Read More »

click to vote

AAAI
2007

126views Intelligent Agents» more AAAI 2007»

Point-Based Policy Iteration

13 years 7 months ago

Download www.cs.duke.edu

We describe a point-based policy iteration (PBPI) algorithm for inﬁnite-horizon POMDPs. PBPI replaces the exact policy improvement step of Hansen’s policy iteration with point...

Shihao Ji, Ronald Parr, Hui Li, Xuejun Liao, Lawre...

claim paper

Read More »

click to vote

DAS
2008
Springer

102views Document Analysis» more DAS 2008»

The Convergence of Iterated Classification

13 years 6 months ago

Download www.cse.lehigh.edu

We report an improved methodology for training a sequence of classifiers for document image content extraction, that is, the location and segmentation of regions containing handwr...

Chang An, Henry S. Baird

claim paper

Read More »

click to vote

CDC
2010
IEEE

139views Control Systems» more CDC 2010»

Q-learning and enhanced policy iteration in discounted dynamic programming

12 years 12 months ago

Download web.mit.edu

We consider the classical finite-state discounted Markovian decision problem, and we introduce a new policy iteration-like algorithm for finding the optimal state costs or Q-facto...

Dimitri P. Bertsekas, Huizhen Yu

claim paper

Read More »

« Prev « First page 1 / 8 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers