Current algorithms for the standard multi-armed bandit problem have good empirical performance and optimal regret bounds. However, real-world problems often differ from the standa...
Travis Mandel, Yun-En Liu, Emma Brunskill, Zoran P...
Is it possible to maximize a monotone submodular function faster than the widely used lazy greedy algorithm (also known as accelerated greedy), both in theory and practice? In thi...
We consider the problem of learning deep representations when target labels are available. In this paper, we show that there exists an intrinsic relationship between target coding and...
Shuo Yang, Ping Luo, Chen Change Loy, Kenneth W. S...
In many real-world situations a decision maker may make decisions across many separate reinforcement learning tasks in parallel, yet there has been very little work on concurrent ...