Sciweavers

102 search results - page 6 / 21
» MDPs with Non-Deterministic Policies
ATAL
2007
Springer
15 years 3 months ago
Commitment-driven distributed joint policy search
Decentralized MDPs provide powerful models of interactions in multi-agent environments, but are often very difficult or even computationally infeasible to solve optimally. Here we...
Stefan J. Witwicki, Edmund H. Durfee
EXACT
2008
14 years 12 months ago
Explaining recommendations generated by MDPs
There has been little work in explaining recommendations generated by Markov Decision Processes (MDPs). We analyze the difficulty of explaining policies computed automatically and id...
Omar Zia Khan, Pascal Poupart, James P. Black
ICMLA
2009
14 years 7 months ago
Automatic Feature Selection for Model-Based Reinforcement Learning in Factored MDPs
Feature selection is an important challenge in machine learning. Unfortunately, most methods for automating feature selection are designed for supervised learning tasks a...
Mark Kroon, Shimon Whiteson
ICML
2008
IEEE
15 years 10 months ago
Apprenticeship learning using linear programming
In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in tha...
Umar Syed, Michael H. Bowling, Robert E. Schapire
NIPS
2007
14 years 11 months ago
Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
Ambuj Tewari, Peter L. Bartlett