Search Sciweavers | Sciweavers

» Topological Value Iteration Algorithm for Markov Decision Pr...

ATAL
2009
Springer

146 views, Intelligent Agents

Transfer via soft homomorphisms

16 years 19 days ago

Download www.eecs.umich.edu

The field of transfer learning aims to speed up learning across multiple related tasks by transferring knowledge between source and target tasks. Past work has shown that when th...

Jonathan Sorg, Satinder Singh

TCOM
2011

130 views

Indirect Reciprocity Game Modelling for Cooperation Stimulation in Cognitive Networks

15 years 1 months ago

Download sig.umd.edu

In cognitive networks, since nodes generally belong to different authorities and pursue different goals, they will not cooperate with others unless cooperation can improve their...

Yan Chen, K. J. Ray Liu

CAV
2010
Springer

190 views, Hardware

Measuring and Synthesizing Systems in Probabilistic Environments

15 years 9 months ago

Download www-verimag.imag.fr

Often one has a preference order among the different systems that satisfy a given specification. Under a probabilistic assumption about the possible inputs, such a preference order...

Krishnendu Chatterjee, Thomas A. Henzinger, Barbar...

ICML
2009
IEEE

148 views, Machine Learning

Predictive representations for policy gradient in POMDPs

16 years 6 months ago

Download damas.ift.ulaval.ca

We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...

Abdeslam Boularias, Brahim Chaib-draa

NIPS
2001

144 views, Information Technology

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

15 years 7 months ago

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
