Sciweavers

AAAI
2012
11 years 11 months ago
MOMDPs: A Solution for Modelling Adaptive Management Problems
In conservation biology and natural resource management, adaptive management is an iterative process of improving management by reducing uncertainty via monitoring. Adaptive manag...
Iadine Chades, Josie Carwardine, Tara G. Martin, S...
ATAL
2011
Springer
12 years 9 months ago
Maximum causal entropy correlated equilibria for Markov games
Motivated by a machine learning perspective, namely that game-theoretic equilibria constraints should serve as guidelines for predicting agents’ strategies, we introduce maximum causal...
Brian D. Ziebart, J. Andrew Bagnell, Anind K. Dey
CORR
2010
Springer
13 years 8 months ago
Optimism in Reinforcement Learning Based on Kullback-Leibler Divergence
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...
Sarah Filippi, Olivier Cappé, Aurelien Gari...
TIT
2008
13 years 9 months ago
Optimal Cross-Layer Scheduling of Transmissions Over a Fading Multiaccess Channel
We consider the problem of several users transmitting packets to a base station, and study an optimal scheduling formulation involving three communication layers, namely, the mediu...
Munish Goyal, Anurag Kumar, Vinod Sharma
UAI
2000
13 years 10 months ago
Fast Planning in Stochastic Games
Stochastic games generalize Markov decision processes (MDPs) to a multiagent setting by allowing the state transitions to depend jointly on all player actions, and having rewards de...
Michael J. Kearns, Yishay Mansour, Satinder P. Sin...
UAI
2004
13 years 10 months ago
Heuristic Search Value Iteration for POMDPs
We present a novel POMDP planning algorithm called heuristic search value iteration (HSVI). HSVI is an anytime algorithm that returns a policy and a provable bound on its regret w...
Trey Smith, Reid G. Simmons
IJCAI
2003
13 years 10 months ago
Point-based value iteration: An anytime algorithm for POMDPs
This paper introduces the Point-Based Value Iteration (PBVI) algorithm for POMDP planning. PBVI approximates an exact value iteration solution by selecting a small set of represen...
Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun
AIPS
2008
13 years 11 months ago
Bounded-Parameter Partially Observable Markov Decision Processes
The POMDP is considered a powerful model for planning under uncertainty. However, it is usually impractical to employ a POMDP with exact parameters to model precisely the real-...
Yaodong Ni, Zhi-Qiang Liu