Search Sciweavers | Sciweavers

771 search results - page 35 / 155

» Markov Decision Processes with Arbitrary Reward Processes

ICML
2009
IEEE

109views Machine Learning» more ICML 2009»

Piecewise-stationary bandit problems with side observations

16 years 6 months ago

Download www.cim.mcgill.ca

We consider a sequential decision problem where the rewards are generated by a piecewise-stationary distribution. However, the different reward distributions are unknown and may c...

Jia Yuan Yu, Shie Mannor

NIPS
2001

131views Information Technology» more NIPS 2001»

The Steering Approach for Multi-Criteria Reinforcement Learning

15 years 7 months ago

Download books.nips.cc

We consider the problem of learning to attain multiple goals in a dynamic environment, which is initially unknown. In addition, the environment may contain arbitrarily varying ele...

Shie Mannor, Nahum Shimkin

AAAI
2007

117views Intelligent Agents» more AAAI 2007»

Authorial Idioms for Target Distributions in TTD-MDPs

15 years 7 months ago

Download www.cc.gatech.edu

In designing Markov Decision Processes (MDP), one must define the world, its dynamics, a set of actions, and a reward function. MDPs are often applied in situations where there i...

David L. Roberts, Sooraj Bhat, Kenneth St. Clair, ...

NIPS
2007

146views Information Technology» more NIPS 2007»

Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs

15 years 7 months ago

Download books.nips.cc

We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...

Ambuj Tewari, Peter L. Bartlett

WSC
2008

154views Modeling And Simulation» more WSC 2008»

On step sizes, stochastic shortest paths, and survival probabilities in Reinforcement Learning

15 years 7 months ago

Download www.informs-sim.org

Reinforcement Learning (RL) is a simulation-based technique useful in solving Markov decision processes if their transition probabilities are not easily obtainable or if the probl...

Abhijit Gosavi
