We propose a general method to watermark and probabilistically identify the structured outputs of machine learning algorithms. Our method is robust to local editing operations and...
Ashish Venugopal, Jakob Uszkoreit, David Talbot, F...
This paper investigates the problem of automatically learning how to restructure the reward function of a Markov decision process so as to speed up reinforcement learning. We begi...
Abstract. This paper extends the N-person IPD game into a more interesting game in economics, namely, the oligopoly game. Due to its market share dynamics, the oligopoly game is mo...
Abstract In multi-instance learning, the training set comprises labeled bags that are composed of unlabeled instances, and the task is to predict the labels of unseen bags. This pa...
TD() is a popular family of algorithms for approximate policy evaluation in large MDPs. TD() works by incrementally updating the value function after each observed transition. It h...