Abstract: Classification-based reinforcement learning (RL) methods have recently been proposed as an alternative to the traditional value-function based methods. These methods use...
In this paper, we try to demonstrate the capability of a very simple architecture to learn to recognize and reproduce facial expressions without the innate capability to recognize ...
An examination of the contributions that can be made by the field of non-mechanistic cybernetics (as elaborated by Gregory Bateson and Anthony Wilden) to a theory of videogames th...
Predictions of lifetimes of dynamically allocated objects can be used to improve time and space efficiency of dynamic memory management in computer programs. Barrett and Zorn [1993] used...
Bandit convex optimization is a special case of online convex optimization with partial information. In this setting, a player attempts to minimize a sequence of adversarially gen...