Search Sciweavers | Sciweavers

233 search results - page 3 / 47

Composing and combining policies under the policy machine

ETRICS
2006

165 views | Security Privacy

An Algebra for Enterprise Privacy Policies Closed Under Composition and Conjunction

13 years 9 months ago

Download www.crypto.ethz.ch

To cope with the complex requirements imposed on the processing of privacy-sensitive data within enterprises, the use of automatic or semi-automatic tools is gradually becoming in...

Dominik Raub, Rainer Steinwandt

ICML
2008
IEEE

147 views | Machine Learning

Apprenticeship learning using linear programming

14 years 6 months ago

Download www.cs.ualberta.ca

In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in tha...

Umar Syed, Michael H. Bowling, Robert E. Schapire

AIPS
2004

145 views | Artificial Intelligence

Optimal Resource Allocation and Policy Formulation in Loosely-Coupled Markov Decision Processes

13 years 6 months ago

Download www.aaai.org

The problem of optimal policy formulation for teams of resource-limited agents in stochastic environments is composed of two strongly-coupled subproblems: a resource allocation pr...

Dmitri A. Dolgov, Edmund H. Durfee

ICML
2002
IEEE

113 views | Machine Learning

Learning from Scarce Experience

14 years 6 months ago

Download www.cs.ucr.edu

Searching the space of policies directly for the optimal policy has been one popular method for solving partially observable reinforcement learning problems. Typically, with each ...

Leonid Peshkin, Christian R. Shelton

ICML
2001
IEEE

185 views | Machine Learning

Off-Policy Temporal Difference Learning with Function Approximation

14 years 6 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
