Table 1 shows the payoff to player one. The same matrix also holds for player two. Player one gains the maximum of 5 points (T = 5) by defecting when player two cooperates. However,...
Axelrod’s original experiments for evolving IPD player strategies involved the use of a basic GA. In this paper we examine how well a simple GA performs against the more recent P...
Reward shaping is a well-known technique applied to help reinforcement-learning agents converge more quickly to near-optimal behavior. In this paper, we introduce social reward sha...
Monica Babes, Enrique Munoz de Cote, Michael L. Li...
Fingerprinting is a technique for generating a representation-independent functional signature for a game-playing agent. Fingerprints can be used to compare agents across represent...
Fingerprinting is a technique that permits automatic classification of strategies for playing a game. In this study the evolution of strategies for playing the iterate...