Sciweavers

233 search results - page 3 / 47
» Composing and combining policies under the policy machine
ETRICS
2006
13 years 9 months ago
An Algebra for Enterprise Privacy Policies Closed Under Composition and Conjunction
To cope with the complex requirements imposed on the processing of privacy-sensitive data within enterprises, the use of automatic or semi-automatic tools is gradually becoming in...
Dominik Raub, Rainer Steinwandt
ICML
2008
IEEE
14 years 6 months ago
Apprenticeship learning using linear programming
In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in tha...
Umar Syed, Michael H. Bowling, Robert E. Schapire
AIPS
2004
13 years 6 months ago
Optimal Resource Allocation and Policy Formulation in Loosely-Coupled Markov Decision Processes
The problem of optimal policy formulation for teams of resource-limited agents in stochastic environments is composed of two strongly-coupled subproblems: a resource allocation pr...
Dmitri A. Dolgov, Edmund H. Durfee
ICML
2002
IEEE
14 years 6 months ago
Learning from Scarce Experience
Searching the space of policies directly for the optimal policy has been one popular method for solving partially observable reinforcement learning problems. Typically, with each ...
Leonid Peshkin, Christian R. Shelton
ICML
2001
IEEE
14 years 6 months ago
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta