Learning While Optimizing an Unknown Fitness Surface

This paper is about Reinforcement Learning (RL) applied to online parameter tuning in Stochastic Local Search (SLS) methods. In particular, a novel application of RL is considered in the Reactive Tabu Search (RTS) method, where the appropriate amount of diversification in prohibition-based (Tabu) local search is adapted in a fast online manner to the characteristics of a task and of the local configuration. We model the parameter-tuning policy as a Markov Decision Process where the states summarize relevant information about the recent history of the search, and we determine a near-optimal policy by using the Least Squares Policy Iteration (LSPI) method. Preliminary experiments on Maximum Satisfiability (MAX-SAT) instances show promising results, indicating that the learnt policy is competitive with previously proposed reactive strategies.

1 Reinforcement Learning and Reactive Search

Reactive Search (RS) [1–3] advocates the integration of sub-symbolic machine learning technique...
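
To make the LSPI idea in the abstract concrete, here is a minimal sketch on a toy problem. It is not the authors' implementation: the three "prohibition levels", the two actions, the reward, the one-hot features, and every function name below are assumptions chosen only for illustration, and no MAX-SAT search is involved. The sketch alternates a least-squares evaluation of the current greedy policy (LSTDQ) with greedy policy improvement until the weights stop changing.

```python
GAMMA = 0.9
STATES = [0, 1, 2]   # hypothetical discretised prohibition levels
ACTIONS = [0, 1]     # 0 = decrease prohibition, 1 = increase it


def step(s, a):
    """Toy deterministic dynamics (assumed): reward 1 when level 2 is reached."""
    s2 = min(2, s + 1) if a == 1 else max(0, s - 1)
    return (1.0 if s2 == 2 else 0.0), s2


def phi(s, a):
    """One-hot feature vector over (state, action) pairs."""
    v = [0.0] * (len(STATES) * len(ACTIONS))
    v[s * len(ACTIONS) + a] = 1.0
    return v


def q_value(w, s, a):
    """Linear approximation Q(s, a) = w . phi(s, a)."""
    return sum(wi * fi for wi, fi in zip(w, phi(s, a)))


def greedy(w, s):
    """Action with the highest approximate Q-value in state s."""
    return max(ACTIONS, key=lambda a: q_value(w, s, a))


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def lstdq(samples, w):
    """Evaluate the policy that is greedy w.r.t. w; return new weights."""
    k = len(w)
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for s, a, r, s2 in samples:
        f = phi(s, a)
        f2 = phi(s2, greedy(w, s2))
        for i in range(k):
            b[i] += f[i] * r
            for j in range(k):
                A[i][j] += f[i] * (f[j] - GAMMA * f2[j])
    return solve(A, b)


def lspi(samples, k, max_iter=20, tol=1e-9):
    """Repeat policy evaluation and improvement until the weights settle."""
    w = [0.0] * k
    for _ in range(max_iter):
        w_new = lstdq(samples, w)
        if max(abs(x - y) for x, y in zip(w, w_new)) < tol:
            return w_new
        w = w_new
    return w


# One sample (s, a, r, s') for every state-action pair of the toy problem.
samples = [(s, a, *step(s, a)) for s in STATES for a in ACTIONS]
weights = lspi(samples, len(STATES) * len(ACTIONS))
policy = [greedy(weights, s) for s in STATES]
```

In this toy setting the learnt policy chooses "increase" at every level, since that is the only way to keep collecting the reward. In a reactive search setting the samples would instead come from observed search trajectories, and the features would summarize the recent search history, as the abstract describes.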

Roberto Battiti, Mauro Brunato, Paolo Campigotto