TOMACS
2010
A stochastic approximation method with max-norm projections and its applications to the Q-learning algorithm
In this paper, we develop a stochastic approximation method to solve a monotone estimation problem and use this method to enhance the empirical performance of the Q-learning algor...
Sumit Kunnumkal, Huseyin Topaloglu
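The Q-learning algorithm that this paper enhances can be illustrated with a minimal tabular sketch. The two-state chain MDP below is a made-up example, and only the standard temporal-difference update is shown; the paper's max-norm projection step is not reproduced here.

```python
import random

def q_learning(num_episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy two-state chain MDP (illustrative only)."""
    rng = random.Random(seed)
    # States 0 and 1; action 0 = "stay", action 1 = "move".
    # Moving from state 0 to state 1 yields reward 1; everything else 0.
    q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
    for _ in range(num_episodes):
        s = 0
        for _ in range(10):
            if rng.random() < epsilon:
                a = rng.choice((0, 1))                      # explore
            else:
                a = max((0, 1), key=lambda x: q[(s, x)])    # exploit
            s_next = 1 if a == 1 else s
            r = 1.0 if (s == 0 and a == 1) else 0.0
            # Temporal-difference update toward r + gamma * max_a' Q(s', a').
            target = r + gamma * max(q[(s_next, 0)], q[(s_next, 1)])
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s_next
    return q

q = q_learning()
```

After training, the learned values reflect that "move" is the better action from state 0.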
JMLR
2010
Finite-sample Analysis of Bellman Residual Minimization
We consider the Bellman residual minimization approach for solving discounted Markov decision problems, where we assume that a generative model of the dynamics and rewards is avai...
Odalric-Ambrym Maillard, Rémi Munos, Alessa...
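The Bellman residual objective the abstract refers to can be illustrated in the simplest possible setting: policy evaluation on a known two-state Markov chain (the transition matrix and rewards below are made-up numbers), minimizing the squared residual by gradient descent. The paper's finite-sample analysis concerns the sample-based version of this objective under a generative model, which this sketch does not attempt.

```python
# Bellman residual minimization for policy evaluation:
# minimize 0.5 * ||V - (r + gamma * P V)||^2 over the tabular V.

gamma = 0.9
P = [[0.5, 0.5],
     [0.2, 0.8]]      # transition matrix under a fixed policy (toy numbers)
r = [1.0, 0.0]        # expected one-step rewards (toy numbers)

def residual(V):
    # residual[s] = V[s] - (r[s] + gamma * sum_t P[s][t] * V[t])
    return [V[s] - (r[s] + gamma * sum(P[s][t] * V[t] for t in range(2)))
            for s in range(2)]

V = [0.0, 0.0]
step = 0.5
for _ in range(5000):
    res = residual(V)
    # Gradient of the squared residual: (I - gamma * P)^T res.
    grad = [res[s] - gamma * sum(P[t][s] * res[t] for t in range(2))
            for s in range(2)]
    V = [V[s] - step * grad[s] for s in range(2)]
```

At the minimum the residual vanishes, so V coincides with the exact fixed point (I - gamma P)^-1 r.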
ICMLA
2009
Multiagent Transfer Learning via Assignment-Based Decomposition
We describe a system that successfully transfers value function knowledge across multiple subdomains of realtime strategy games in the context of multiagent reinforcement learning....
Scott Proper, Prasad Tadepalli
ICRA
2010
IEEE
Multirobot coordination by auctioning POMDPs
We consider the problem of task assignment and execution in multirobot systems by proposing a procedure for bid estimation in auction protocols. Auctions are of interest to mu...
Matthijs T. J. Spaan, Nelson Gonçalves, Jo&...
MP
2006
Two-stage integer programs with stochastic right-hand sides: a superadditive dual approach
We consider two-stage pure integer programs with discretely distributed stochastic right-hand sides. We present an equivalent superadditive dual formulation that uses the value fun...
Nan Kong, Andrew J. Schaefer, Brady Hunsaker
AI
2008
Springer
Graphically structured value-function compilation
Classical work on eliciting and representing preferences over multi-attribute alternatives has attempted to recognize conditions under which value functions take on particularly s...
Ronen I. Brafman, Carmel Domshlak
ICML
2010
IEEE
Inverse Optimal Control with Linearly-Solvable MDPs
We present new algorithms for inverse optimal control (or inverse reinforcement learning, IRL) within the framework of linearly-solvable MDPs (LMDPs). Unlike most prior IRL algorit...
Dvijotham Krishnamurthy, Emanuel Todorov
NIPS
2000
APRICODD: Approximate Policy Construction Using Decision Diagrams
We propose a method of approximate dynamic programming for Markov decision processes (MDPs) using algebraic decision diagrams (ADDs). We produce near-optimal value functions and p...
Robert St-Aubin, Jesse Hoey, Craig Boutilier
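APRICODD's contribution is to represent value functions compactly as algebraic decision diagrams. As a point of contrast, the flat dynamic programming it improves on can be sketched as tabular value iteration on a made-up two-state MDP, with no ADD machinery involved:

```python
# Flat value iteration: V(s) <- max_a sum_{s'} p(s'|s,a) [r + gamma V(s')].
# The MDP below is a toy example, not taken from the paper.

gamma = 0.9
# transitions[s][a] = list of (prob, next_state, reward)
transitions = {
    0: {0: [(1.0, 0, 0.0)],
        1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 1, 0.5)],
        1: [(1.0, 0, 0.0)]},
}

V = {0: 0.0, 1: 0.0}
for _ in range(200):
    V = {s: max(sum(p * (rew + gamma * V[sn]) for p, sn, rew in outs)
                for outs in transitions[s].values())
         for s in V}
```

Value iteration contracts at rate gamma, so 200 sweeps leave V essentially at the optimal fixed point; an ADD representation would instead merge states with identical values into shared subgraphs.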
DAGSTUHL
2008
Interactive Multiobjective Optimization Using a Set of Additive Value Functions
In this chapter, we present a new interactive procedure for multiobjective optimization, which is based on the use of a set of value functions as a preference model built...
José Rui Figueira, Salvatore Greco, Vincent...
GECCO
2010
Springer
Multi-task evolutionary shaping without pre-specified representations
Shaping functions can be used in multi-task reinforcement learning (RL) to incorporate knowledge from previously experienced tasks to speed up learning on a new task. So far, rese...
Matthijs Snel, Shimon Whiteson