Basically, instrumental conditioning is learning through consequences: Behavior that produces positive results (high “instrumental response”) is reinforced, and that which pro...
In this paper, we investigate the use of hierarchical reinforcement learning (HRL) to speed up the acquisition of cooperative multi-agent tasks. We introduce a hierarchical multi-a...
Rajbala Makar, Sridhar Mahadevan, Mohammad Ghavamz...
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...
—The purpose of this paper is to present a comparison between two methods of building adaptive controllers for robots. In spite of the wide range of techniques which are used for...
Sergiu Goschin, Eduard Franti, Monica Dascalu, San...
In this paper, we describe how certain aspects of the biological phenomena of stigmergy can be imported into multiagent reinforcement learning (MARL), with the purpose of better e...