Sciweavers

18 search results - page 2 / 4
» Non-Deterministic Policies in Markovian Decision Processes
Sort
View
ICML
1994
IEEE
13 years 9 months ago
Learning Without State-Estimation in Partially Observable Markovian Decision Processes
Reinforcement learning (RL) algorithms provide a sound theoretical basis for building learning control architectures for embedded agents. Unfortunately all of the theory and much ...
Satinder P. Singh, Tommi Jaakkola, Michael I. Jord...
ICML
2003
IEEE
14 years 6 months ago
The Cross Entropy Method for Fast Policy Search
We present a learning framework for Markovian decision processes that is based on optimization in the policy space. Instead of using relatively slow gradient-based optimization al...
Shie Mannor, Reuven Y. Rubinstein, Yohai Gat
AAAI
1996
13 years 7 months ago
Rewarding Behaviors
Markov decision processes (MDPs) are a very popular tool for decision theoretic planning (DTP), partly because of the welldeveloped, expressive theory that includes effective solu...
Fahiem Bacchus, Craig Boutilier, Adam J. Grove
JSAC
2011
82views more  JSAC 2011»
13 years 21 days ago
Optimal Cognitive Access of Markovian Channels under Tight Collision Constraints
Abstract—The problem of cognitive access of channels of primary users by a secondary user is considered. The transmissions of primary users are modeled as independent continuous-...
Xin Li, Qianchuan Zhao, Xiaohong Guan, Lang Tong
FOCS
2007
IEEE
14 years 2 days ago
Approximation Algorithms for Partial-Information Based Stochastic Control with Markovian Rewards
We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...
Sudipto Guha, Kamesh Munagala