Search Sciweavers | Sciweavers

1997 search results - page 223 / 400

» On the convergence of Hill's method

AAAI
2012

205 views | Intelligent Agents

Kernel-Based Reinforcement Learning on Representative States

13 years 3 months ago

Download www.bkveton.com

Markov decision processes (MDPs) are an established framework for solving sequential decision-making problems under uncertainty. In this work, we propose a new method for batchmod...

Branislav Kveton, Georgios Theocharous

ICML
1998
IEEE

165 views | Machine Learning

Intra-Option Learning about Temporally Abstract Actions

16 years 2 months ago

Download www.cs.ualberta.ca

Intra-Option Learning about Temporally Abstract Actions. Richard S. Sutton, Department of Computer Science, University of Massachusetts, Amherst, MA 01003-4610, rich@cs.umass.edu; Doina Precup, D...

Richard S. Sutton, Doina Precup, Satinder P. Singh

ECCV
2000
Springer

241 views | Computer Vision

Coupled Geodesic Active Regions for Image Segmentation: A Level Set Approach

16 years 3 months ago

Download vision.mas.ecp.fr

Abstract. This paper presents a novel variational method for image segmentation that unifies boundary and region-based information sources under the Geodesic Active Region framework. ...

Nikos Paragios, Rachid Deriche

ICML
2001
IEEE

185 views | Machine Learning

Off-Policy Temporal Difference Learning with Function Approximation

16 years 2 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta

ICDCS
2010
IEEE

167 views | Distributed And Parallel Com...

Stochastic Steepest-Descent Optimization of Multiple-Objective Mobile Sensor Coverage

15 years 5 months ago

Download www.cs.purdue.edu

We propose a steepest descent method to compute optimal control parameters for balancing between multiple performance objectives in stateless stochastic scheduling, wherein the ...

Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip, Na...
