Search Sciweavers | Sciweavers

92

ATAL
2008
Springer

104views Intelligent Agents» more ATAL 2008»

Expediting RL by using graphical structures

15 years 3 months ago

The goal of Reinforcement learning (RL) is to maximize reward (minimize cost) in a Markov decision process (MDP) without knowing the underlying model a priori. RL algorithms tend ...

Peng Dai, Alexander L. Strehl, Judy Goldsmith

claim paper

Read More »

88

click to vote

AAAI
2010

163views Intelligent Agents» more AAAI 2010»

Structured Parameter Elicitation

15 years 2 months ago

Download motion.comp.nus.edu.sg

The behavior of a complex system often depends on parameters whose values are unknown in advance. To operate effectively, an autonomous agent must actively gather information on t...

Li Ling Ko, David Hsu, Wee Sun Lee, Sylvie C. W. O...

claim paper

Read More »

114

click to vote

AAAI
2010

185views Intelligent Agents» more AAAI 2010»

Symbolic Dynamic Programming for First-order POMDPs

15 years 2 months ago

Download www-kd.iai.uni-bonn.de

Partially-observable Markov decision processes (POMDPs) provide a powerful model for sequential decision-making problems with partially-observed state and are known to have (appro...

Scott Sanner, Kristian Kersting

claim paper

Read More »

108

click to vote

NIPS
2008

109views Information Technology» more NIPS 2008»

Biasing Approximate Dynamic Programming with a Lower Discount Factor

15 years 2 months ago

Download hal.inria.fr

Most algorithms for solving Markov decision processes rely on a discount factor, which ensures their convergence. It is generally assumed that using an artificially low discount f...

Marek Petrik, Bruno Scherrer

claim paper

Read More »

104

click to vote

NIPS
2007

146views Information Technology» more NIPS 2007»

Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs

15 years 2 months ago

Download books.nips.cc

We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...

Ambuj Tewari, Peter L. Bartlett

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers