Sciweavers

651 search results - page 74 / 131
» Algorithms for Inverse Reinforcement Learning
Sort
View
ESANN
2003
14 years 11 months ago
Improving iterative repair strategies for scheduling with the SVM
The resource constraint project scheduling problem (RCPSP) is an NP-hard benchmark problem in scheduling which takes into account the limitation of resources’ availabilities in ...
Kai Gersmann, Barbara Hammer
ML
2008
ACM
152views Machine Learning» more  ML 2008»
14 years 9 months ago
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
Abstract. We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian Decision Problems. As opposed to previous theoretical wo...
András Antos, Csaba Szepesvári, R&ea...
ICMLA
2010
14 years 7 months ago
Multimodal Parameter-exploring Policy Gradients
Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
ICIP
2010
IEEE
14 years 7 months ago
Stochastic gradient descent for robust inverse photomask synthesis in optical lithography
Optical lithography is a critical step in the semiconductor manufacturing process, and one key problem is the design of the photomask for a particular circuit pattern, given the o...
Ningning Jia, Edmund Y. Lam
ATAL
2008
Springer
14 years 11 months ago
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...