We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...
Bandit-based methods for tree search have recently gained popularity when applied to huge trees, e.g. in the game of Go [6]. Their efficient exploration of the tree enables to ret...
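To make the idea of bandit-based exploration concrete, here is a minimal sketch of how a tree search might choose which child of a node to explore next. It assumes the UCB1 rule, which the abstract above does not name; the function `ucb1_select`, its parameters, and the constant `c` are illustrative choices, not taken from the paper.

```python
import math

def ucb1_select(counts, values, c=math.sqrt(2)):
    """Pick the child index with the highest UCB1 score.

    counts[i] is the number of visits to child i (hypothetical bookkeeping),
    values[i] is the total reward accumulated through child i.
    """
    # Visit every unexplored child once before applying the formula.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    scores = [
        # Mean reward plus an exploration bonus that shrinks with visits.
        values[i] / counts[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    return scores.index(max(scores))
```

Rarely visited children get a large bonus, so the search keeps sampling them until their estimated value is reliable enough to rule them out.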
This article presents an implemented multi-robot system for playing the popular game of laser tag. The object of the game is to search for and tag opponents that can move freely a...
Matthew Rosencrantz, Geoffrey J. Gordon, Sebastian...
A token is hidden in one of several boxes and then the boxes are locked. The probability of placing the token in each of the boxes is known. A searcher is looking for the token by...
Amotz Bar-Noy, Panagiotis Cheilaris, Yi Feng 0002,...
The Smith-Waterman algorithm is a dynamic programming method for determining optimal local alignments between nucleotide or protein sequences. However, it suffers from quadrati...
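As an illustration of the dynamic programming recurrence, the following sketch computes only the best local alignment score. It assumes a linear gap penalty and simple match/mismatch scores; the function name `smith_waterman` and the default scoring values are illustrative and not taken from the paper.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            # Clamping at 0 lets an alignment restart anywhere (local, not global).
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best
```

For example, `smith_waterman("TTACGTT", "GGACGGG")` scores the shared `ACG` segment. The two nested loops visit every cell of the score matrix, so the work grows with the product of the two sequence lengths.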