target policy | Sciweavers

220

ICML
2010
IEEE

231 views, Machine Learning

Toward Off-Policy Learning Control with Function Approximation

15 years 8 months ago

We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...

Hamid Reza Maei, Csaba Szepesvári, Shalabh ...

221

CASSIS
2005
Springer

142 views, Human Computer Interaction

Mobile Resource Guarantees and Policies

16 years 27 days ago

Download homepages.inf.ed.ac.uk

This paper introduces notions of resource policy for mobile code to be run on smart devices, to integrate with the proof-carrying code architecture of the Mobile Resource Guarantee...

David Aspinall, Kenneth MacKenzie
