Search Sciweavers | Sciweavers

656 search results - page 105 / 132

» Complexity of finite-horizon Markov decision process problem...


AAAI
2007

Intelligent Agents

Authorial Idioms for Target Distributions in TTD-MDPs

15 years 4 months ago

Download www.cc.gatech.edu

In designing Markov Decision Processes (MDP), one must define the world, its dynamics, a set of actions, and a reward function. MDPs are often applied in situations where there i...

David L. Roberts, Sooraj Bhat, Kenneth St. Clair, ...


ATAL
2008
Springer

Intelligent Agents

Expediting RL by using graphical structures

15 years 3 months ago

Download www.cs.washington.edu

The goal of reinforcement learning (RL) is to maximize reward (minimize cost) in a Markov decision process (MDP) without knowing the underlying model a priori. RL algorithms tend ...

Peng Dai, Alexander L. Strehl, Judy Goldsmith


IJCAI
2003

Artificial Intelligence

Approximating Optimal Policies for Agents with Limited Execution Resources

15 years 3 months ago

Download ai.stanford.edu

An agent with limited consumable execution resources needs policies that attempt to achieve good performance while respecting these limitations. Otherwise, an agent (such as a pla...

Dmitri A. Dolgov, Edmund H. Durfee


IJCAI
2003

Artificial Intelligence

Generalizing Plans to New Environments in Relational MDPs

15 years 3 months ago

Download select.cs.cmu.edu

A longstanding goal in planning research is the ability to generalize plans developed for some set of environments to a new but similar environment, with minimal or no replanning....

Carlos Guestrin, Daphne Koller, Chris Gearhart, Ne...


NIPS
2003

Information Technology

Distributed Optimization in Adaptive Networks

15 years 3 months ago

Download books.nips.cc

We develop a protocol for optimizing dynamic behavior of a network of simple electronic components, such as a sensor network, an ad hoc network of mobile devices, or a network of ...

Ciamac Cyrus Moallemi, Benjamin Van Roy
