Search Sciweavers | Sciweavers

102 search results - page 6 / 21

» MDPs with Non-Deterministic Policies


ATAL
2007
Springer

141 views, Intelligent Agents

Commitment-driven distributed joint policy search

15 years 8 months ago

Download www-personal.umich.edu

Decentralized MDPs provide powerful models of interactions in multi-agent environments, but are often very difficult or even computationally infeasible to solve optimally. Here we...

Stefan J. Witwicki, Edmund H. Durfee


EXACT
2008

128 views, Applied Computing

Explaining recommendations generated by MDPs

15 years 4 months ago

Download www.cs.uwaterloo.ca

There has been little work in explaining recommendations generated by Markov Decision Processes (MDPs). We analyze the difficulty of explaining policies computed automatically and id...

Omar Zia Khan, Pascal Poupart, James P. Black


ICMLA
2009

185 views, Machine Learning

Automatic Feature Selection for Model-Based Reinforcement Learning in Factored MDPs

14 years 11 months ago

Download staff.science.uva.nl

Feature selection is an important challenge in machine learning. Unfortunately, most methods for automating feature selection are designed for supervised learning tasks a...

Mark Kroon, Shimon Whiteson


ICML
2008
IEEE

147 views, Machine Learning

Apprenticeship learning using linear programming

16 years 2 months ago

Download www.cs.ualberta.ca

In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in tha...

Umar Syed, Michael H. Bowling, Robert E. Schapire


NIPS
2007

146 views, Information Technology

Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs

15 years 3 months ago

Download books.nips.cc

We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...

Ambuj Tewari, Peter L. Bartlett
