This paper investigates how to automatically create a dialogue control component of a listening agent to reduce the current high cost of manually creating such components. We coll...
Toyomi Meguro, Ryuichiro Higashinaka, Yasuhiro Min...
We consider the restless multi-armed bandit (RMAB) problem with unknown dynamics. At each time, a player chooses K out of N (N > K) arms to play. The state of each ar...
SMALLbox is a new foundational framework for processing signals, using adaptive sparse structured representations. The main aim of SMALLbox is to become a test ground for explorati...
Ivan Damnjanovic, Matthew E. P. Davies, Mark D. Pl...
Reinforcement Learning (RL) is a paradigm of modern Machine Learning (ML) which uses rewards and punishments to guide the learning process. One of the central ideas of RL is learning by “direct-online...