Cultivating desired behaviour: policy teaching via environment-dynamics tweaks



In this paper we study, for the first time explicitly, the implications of endowing an interested party (i.e. a teacher) with the ability to modify the underlying dynamics of the environment, in order to encourage an agent to learn to follow a specific policy. We introduce a cost function that the teacher can use to balance the modifications it makes to the underlying environment dynamics against the learner's performance relative to some ideal, desired policy. We formulate the teacher's problem of determining optimal environment changes as a planning and control problem, and empirically validate the effectiveness of our model.

Categories and Subject Descriptors: I.2.6 [Artificial Intelligence]: Learning; I.2.8 [Artificial Intelligence]: Problem Solving, Control Methods, and Search -- Control theory; I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence -- Multiagent systems

General Terms: Algorithms

Keywords: Teacher-learner, control theory, Kullback-Leibler Rat...
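The abstract describes a cost that trades off how much the teacher changes the environment against how far the learner remains from the desired policy, but this listing does not give the formula. As a rough illustration only, the sketch below assumes both terms are measured with a Kullback-Leibler divergence (chosen because the term appears among the keywords) and combined with a scalar weight. The function names, the dictionary layout, and the `weight` parameter are hypothetical and are not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Return KL(p || q) for two discrete distributions of equal length."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf  # p puts mass where q has none
            total += pi * math.log(pi / qi)
    return total


def teacher_cost(nominal, modified, learner_policy, desired_policy, weight=1.0):
    """Hypothetical teacher cost (an assumption, not the paper's definition).

    nominal, modified: dict mapping (state, action) -> next-state probabilities.
    learner_policy, desired_policy: dict mapping state -> action probabilities.
    """
    # Size of the teacher's intervention: how far the modified dynamics
    # drift from the nominal dynamics, summed over state-action pairs.
    modification = sum(
        kl_divergence(modified[key], nominal[key]) for key in nominal
    )
    # Learner's shortfall: how far its policy is from the desired one.
    mismatch = sum(
        kl_divergence(desired_policy[s], learner_policy[s]) for s in desired_policy
    )
    return modification + weight * mismatch


# Toy usage: one state, one action, two possible next states.
nominal = {("s0", "a"): [0.5, 0.5]}
modified = {("s0", "a"): [0.9, 0.1]}
desired = {"s0": [1.0, 0.0]}
learner = {"s0": [0.5, 0.5]}
cost = teacher_cost(nominal, modified, learner, desired, weight=2.0)
```

A larger `weight` makes the hypothetical teacher tolerate bigger changes to the dynamics in exchange for a learner policy closer to the desired one. How the paper itself sets this balance is not stated in the listing.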



Artificial Intelligence | ATAL 2010 | Distributed Artificial Intelligence | Intelligent Agents | Underlying Environment Dynamics |


Post Info

Added	08 Nov 2010
Updated	08 Nov 2010
Type	Conference
Year	2010
Where	ATAL
Authors	Zinovi Rabinovich, Lachlan Dufton, Kate Larson, Nicholas R. Jennings
