Sciweavers

AAAI
2012
13 years 3 months ago
Kernel-Based Reinforcement Learning on Representative States
Markov decision processes (MDPs) are an established framework for solving sequential decision-making problems under uncertainty. In this work, we propose a new method for batchmod...
Branislav Kveton, Georgios Theocharous
ICML
1998
IEEE
16 years 2 months ago
Intra-Option Learning about Temporally Abstract Actions
Intra-Option Learning about Temporally Abstract Actions. Richard S. Sutton, Department of Computer Science, University of Massachusetts, Amherst, MA 01003-4610, rich@cs.umass.edu. Doina Precup D...
Richard S. Sutton, Doina Precup, Satinder P. Singh
ECCV
2000
Springer
16 years 3 months ago
Coupled Geodesic Active Regions for Image Segmentation: A Level Set Approach
Abstract. This paper presents a novel variational method for image segmentation that unifies boundary and region-based information sources under the Geodesic Active Region framework. ...
Nikos Paragios, Rachid Deriche
ICML
2001
IEEE
16 years 2 months ago
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
ICDCS
2010
IEEE
15 years 5 months ago
Stochastic Steepest-Descent Optimization of Multiple-Objective Mobile Sensor Coverage
We propose a steepest descent method to compute optimal control parameters for balancing between multiple performance objectives in stateless stochastic scheduling, wherein the ...
Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip, Na...