Concurrent PAC RL

8 years 1 month ago

In many real-world situations a decision maker may make decisions across many separate reinforcement learning tasks in parallel, yet there has been very little work on concurrent RL. Building on the efficient exploration RL literature, we introduce two new concurrent RL algorithms and bound their sample complexity. We show that under some mild conditions, both when the agent is known to be acting in many copies of the same MDP, and when they are not the same but are taken from a finite set, we can gain linear improvements in the sample complexity over not sharing information. This is quite exciting as a linear speedup is the most one might hope to gain. Our preliminary experiments confirm this result and show empirical benefits.

The ability to share information across tasks to speed learning is a critical aspect of intelligence, and an important goal for autonomous agents. These tasks may themselves involve a sequence of stochastic decisions: consider an online store interacting w...

Zhaohan Guo, Emma Brunskill
