Sciweavers

AAAI
2012
Planning in Factored Action Spaces with Symbolic Dynamic Programming
We consider symbolic dynamic programming (SDP) for solving Markov Decision Processes (MDPs) with factored state and action spaces, where both states and actions are described by se...
Aswin Raghavan, Saket Joshi, Alan Fern, Prasad Tad...
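The entry above concerns symbolic dynamic programming over decision-diagram representations; as a point of reference only, here is a minimal non-symbolic sketch of value iteration over a toy factored state/action space. The dynamics, rewards, and variable counts below are illustrative assumptions, not the paper's method.

```python
import itertools

# Hedged sketch: plain (non-symbolic) value iteration over a tiny MDP whose
# states and actions are tuples of binary variables, enumerated explicitly.
# SDP would instead manipulate V and the dynamics symbolically; this only
# illustrates the factored state/action structure.

STATE_VARS = 2   # hypothetical: states are tuples in {0,1}^2
ACTION_VARS = 2  # hypothetical: actions are tuples in {0,1}^2
GAMMA = 0.9

states = list(itertools.product([0, 1], repeat=STATE_VARS))
actions = list(itertools.product([0, 1], repeat=ACTION_VARS))

def next_state(s, a):
    # toy deterministic dynamics: each action bit flips the matching state bit
    return tuple(si ^ ai for si, ai in zip(s, a))

def reward(s, a):
    # toy reward: number of state bits set, minus a cost per action bit used
    return sum(s) - 0.1 * sum(a)

V = {s: 0.0 for s in states}
for _ in range(100):  # value iteration to near-convergence
    V = {s: max(reward(s, a) + GAMMA * V[next_state(s, a)] for a in actions)
         for s in states}

policy = {s: max(actions, key=lambda a: reward(s, a) + GAMMA * V[next_state(s, a)])
          for s in states}
```

Explicit enumeration like this scales exponentially in the number of variables, which is exactly what the symbolic approach is designed to avoid.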
RAS
2010
Probabilistic Policy Reuse for inter-task transfer learning
Policy Reuse is a reinforcement learning technique that learns a new policy efficiently by reusing similar, previously learned policies. The Policy Reuse learner improves its exploration b...
Fernando Fernández, Javier García, M...
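The core exploration idea behind Policy Reuse can be sketched as follows: with some probability follow a past policy, otherwise act epsilon-greedily on the new task's value estimates. All function names and parameters here are illustrative, not the paper's exact formulation.

```python
import random

# Hedged sketch of the pi-reuse exploration idea behind Policy Reuse:
# with probability psi, follow a previously learned policy; otherwise act
# epsilon-greedily on the new task's tabular Q-values.

def pi_reuse_action(state, q_new, past_policy, actions,
                    psi=0.5, epsilon=0.1, rng=random):
    if rng.random() < psi:                     # exploit transferred knowledge
        return past_policy(state)
    if rng.random() < epsilon:                 # ordinary random exploration
        return rng.choice(actions)
    return max(actions, key=lambda a: q_new.get((state, a), 0.0))  # greedy
```

In the Policy Reuse literature the reuse probability typically decays within an episode, so the learner leans on the past policy early and on its own Q-values later.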
ORL
2006
A note on two-person zero-sum communicating stochastic games
For undiscounted two-person zero-sum communicating stochastic games with finite state and action spaces, a solution procedure is proposed that exploits the communication property,...
Zeynep Müge Avsar, Melike Baykal-Gursoy
AUTOMATICA
2007
Simulation-based optimal sensor scheduling with application to observer trajectory planning
The sensor scheduling problem can be formulated as a controlled hidden Markov model and this paper solves the problem when the state, observation and action spaces are continuous....
Sumeetpal S. Singh, Nikolaos Kantas, Ba-Ngu Vo, Ar...
NIPS
2007
Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods
Learning in real-world domains often requires dealing with continuous state and action spaces. Although many solutions have been proposed to apply Reinforcement Learning algorithm...
Alessandro Lazaric, Marcello Restelli, Andrea Bona...
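The sequential-Monte-Carlo idea for continuous actions can be sketched like this: represent the policy in a state as a set of action "particles", reweight them by estimated action values, and resample with noise so the particles concentrate on high-value regions. The details below are an illustrative sketch, not the paper's exact SMC-learning algorithm.

```python
import math
import random

# Hedged sketch: one SMC-style update of a particle-based continuous-action
# policy. Particles are reweighted by a softmax over their action-value
# estimates, resampled in proportion to weight, then perturbed with noise.

def smc_policy_step(particles, q_estimate, temperature=1.0,
                    jitter=0.05, rng=random):
    qs = [q_estimate(a) for a in particles]
    m = max(qs)                                # subtract max for stability
    w = [math.exp((q - m) / temperature) for q in qs]
    total = sum(w)
    w = [x / total for x in w]
    # resample actions in proportion to their weights, then jitter them
    new = rng.choices(particles, weights=w, k=len(particles))
    return [a + rng.gauss(0.0, jitter) for a in new]
```

Iterating this step drives the particle set toward the maxima of the value estimate, which is how the approach sidesteps an explicit argmax over a continuous action space.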
NIPS
2008
Fitted Q-iteration by Advantage Weighted Regression
Recently, fitted Q-iteration (FQI) based methods have become more popular due to their increased sample efficiency, more stable learning process, and higher quality of the re...
Gerhard Neumann, Jan Peters
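The generic FQI loop underlying this family of methods can be sketched as follows: from a fixed batch of transitions (s, a, r, s'), repeatedly fit Q to the bootstrapped targets r + gamma * max_a' Q(s', a'). The "regressor" below is a plain dict (an exact tabular fit) standing in for the function approximator a method like advantage-weighted regression would use; all names are illustrative.

```python
# Hedged sketch of generic fitted Q-iteration (FQI) on a batch of
# transitions. A real FQI method would replace the dict assignment with a
# supervised regression step (e.g. trees, neural nets, or, as in this
# paper's setting, an advantage-weighted regression).

def fitted_q_iteration(batch, actions, gamma=0.95, iters=50):
    q = {}
    for _ in range(iters):
        targets = {}
        for s, a, r, s2 in batch:
            best_next = max((q.get((s2, a2), 0.0) for a2 in actions),
                            default=0.0)
            targets[(s, a)] = r + gamma * best_next
        q = targets        # "fit": exact tabular regression onto the targets
    return q
```

Because the batch is fixed, each iteration is an ordinary supervised regression problem, which is the source of FQI's sample efficiency and stability relative to online Q-learning.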
SIGECOM
2006
ACM
Implementation with a bounded action space
While traditional mechanism design typically assumes isomorphism between the agents’ type- and action spaces, in many situations the agents face strict restrictions on their act...
Liad Blumrosen, Michal Feldman