Search Sciweavers | Sciweavers

55 search results - page 8 / 11

» Approximate Policy Iteration using Large-Margin Classifiers

ICMLA
2010

211 views | Machine Learning

Ensembles of Neural Networks for Robust Reinforcement Learning

15 years 5 months ago

Download ahans.de

Reinforcement learning algorithms that employ neural networks as function approximators have proven to be powerful tools for solving optimal control problems. However, their traini...

Alexander Hans, Steffen Udluft

ECML
2004
Springer

139 views | Machine Learning

Batch Reinforcement Learning with State Importance

16 years 25 days ago

Download www.research.rutgers.edu

Abstract. We investigate the problem of using function approximation in reinforcement learning where the agent’s policy is represented as a classifier mapping states to actions....

Lihong Li, Vadim Bulitko, Russell Greiner

TON
2010

151 views

Throughput Optimal Distributed Power Control of Stochastic Wireless Networks

15 years 2 months ago

Download pantheon.yale.edu

The Maximum Differential Backlog (MDB) control policy of Tassiulas and Ephremides has been shown to adaptively maximize the stable throughput of multihop wireless networks with ran...

Yufang Xi, Edmund M. Yeh

UAI
2004

121 views | Artificial Intelligence

Discretized Approximations for POMDP with Average Cost

15 years 8 months ago

Download web.mit.edu

In this paper, we propose a new lower approximation scheme for POMDP with discounted and average cost criterion. The approximating functions are determined by their values at a fi...

Huizhen Yu, Dimitri P. Bertsekas

RSS
2007

176 views | Robotics

Active Policy Learning for Robot Planning and Exploration under Uncertainty

15 years 8 months ago

Download www.roboticsproceedings.org

Abstract. This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested i...

Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...
