A k-NN Based Perception Scheme for Reinforcement Learning

15 years 10 months ago

Download www.dia.fi.upm.es

Abstract: Reinforcement Learning (RL) is a paradigm of modern Machine Learning (ML) that uses rewards and punishments to guide the learning process. One of the central ideas of RL is learning by “direct-online” interaction with the environment. This is the key difference from supervised machine learning, in which the learner is told what actions to take. In RL, by contrast, the agent (learner) acts autonomously and receives only a scalar reward signal, which is used to evaluate how good the current behavioral policy is. The RL framework is designed to guide the learner toward maximizing the average reward in the long run. One consequence of this learning paradigm is that the agent must explore new behavioral policies, because there is no supervisor that tells it which actions to take; thus, the trade-off between exploration and exploitation is a key characteristic of RL. Typically, exploration procedures select actions following a random distribution in order to gain more knowledge of the env...
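To make the exploration/exploitation trade-off concrete, the following is a minimal sketch of epsilon-greedy action selection, one common way of choosing actions at random with some probability. It is an illustration only and is not taken from the paper: the function name `epsilon_greedy`, the `q_values` list of estimated action values, and the `epsilon` parameter are all assumptions introduced here.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index given estimated action values (hypothetical helper).

    q_values: list of estimated values, one per action (assumed representation).
    epsilon:  probability of exploring instead of exploiting.
    rng:      source of randomness; defaults to the random module.
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    estimates = [0.1, 0.9, 0.3]  # made-up values for illustration
    print(epsilon_greedy(estimates, epsilon=0.1))
```

With `epsilon` set to 0 the agent always exploits its current estimates; with `epsilon` set to 1 it always explores. Intermediate values balance the two.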

José Antonio Martin H., Javier de Lope Asia
