Sciweavers

162 search results - page 24 / 33
» Topological Value Iteration Algorithm for Markov Decision Pr...
ATAL
2009
Springer
15 years 4 months ago
Transfer via soft homomorphisms
The field of transfer learning aims to speed up learning across multiple related tasks by transferring knowledge between source and target tasks. Past work has shown that when th...
Jonathan Sorg, Satinder Singh
TCOM
2011
14 years 4 months ago
Indirect Reciprocity Game Modelling for Cooperation Stimulation in Cognitive Networks
In cognitive networks, since nodes generally belong to different authorities and pursue different goals, they will not cooperate with others unless cooperation can improve their...
Yan Chen, K. J. Ray Liu
CAV
2010
Springer
15 years 1 month ago
Measuring and Synthesizing Systems in Probabilistic Environments
Often one has a preference order among the different systems that satisfy a given specification. Under a probabilistic assumption about the possible inputs, such a preference order...
Krishnendu Chatterjee, Thomas A. Henzinger, Barbar...
ICML
2009
IEEE
15 years 10 months ago
Predictive representations for policy gradient in POMDPs
We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...
Abdeslam Boularias, Brahim Chaib-draa
NIPS
2001
14 years 11 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...