Abstract. In order to establish autonomous behavior for technical systems, the well-known trade-off between reactive control and deliberative planning has to be considered. Within ...
We explore dynamic shaping to integrate our prior beliefs about the final policy into a conventional reinforcement learning system. Shaping provides a positive or negative artificial...
Abstract. Reinforcement Learning (RL) is a paradigm of modern Machine Learning (ML) which uses rewards and punishments to guide the learning process. One of the central ideas of RL is learning by “direct-online...
Optimization of performance in collective systems often requires altruism. The emergence and stabilization of altruistic behaviors are difficult to achieve because the agents incur ...
In this paper, we describe how certain aspects of the biological phenomenon of stigmergy can be imported into multiagent reinforcement learning (MARL), with the purpose of better e...