Search Sciweavers | Sciweavers

18 search results - page 2 / 4

» Non-Deterministic Policies in Markovian Decision Processes

click to vote

ICML
1994
IEEE

151views Machine Learning» more ICML 1994»

Learning Without State-Estimation in Partially Observable Markovian Decision Processes

13 years 9 months ago

Download www.eecs.umich.edu

Reinforcement learning (RL) algorithms provide a sound theoretical basis for building learning control architectures for embedded agents. Unfortunately all of the theory and much ...

Satinder P. Singh, Tommi Jaakkola, Michael I. Jord...

claim paper

Read More »

click to vote

ICML
2003
IEEE

165views Machine Learning» more ICML 2003»

The Cross Entropy Method for Fast Policy Search

14 years 6 months ago

Download www.hpl.hp.com

We present a learning framework for Markovian decision processes that is based on optimization in the policy space. Instead of using relatively slow gradient-based optimization al...

Shie Mannor, Reuven Y. Rubinstein, Yohai Gat

claim paper

Read More »

click to vote

AAAI
1996

119views Intelligent Agents» more AAAI 1996»

Rewarding Behaviors

13 years 7 months ago

Download www.cs.toronto.edu

Markov decision processes (MDPs) are a very popular tool for decision theoretic planning (DTP), partly because of the welldeveloped, expressive theory that includes effective solu...

Fahiem Bacchus, Craig Boutilier, Adam J. Grove

claim paper

Read More »

click to vote

JSAC
2011

82views more JSAC 2011»

Optimal Cognitive Access of Markovian Channels under Tight Collision Constraints

13 years 21 days ago

Download acsp.ece.cornell.edu

Abstract—The problem of cognitive access of channels of primary users by a secondary user is considered. The transmissions of primary users are modeled as independent continuous-...

Xin Li, Qianchuan Zhao, Xiaohong Guan, Lang Tong

claim paper

Read More »

click to vote

FOCS
2007
IEEE

157views Theoretical Computer Science» more FOCS 2007»

Approximation Algorithms for Partial-Information Based Stochastic Control with Markovian Rewards

14 years 2 days ago

Download www.cis.upenn.edu

We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...

Sudipto Guha, Kamesh Munagala

claim paper

Read More »

« Prev « First page 2 / 4 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers