We consider supervised learning of a ranking function, which is a mapping from instances to total orders over a set of labels (options). The training information consists of exampl...
In this paper, we describe methods for e ciently computing better solutions to control problems in continuous state spaces. We provide algorithms that exploit online search to boo...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
When labelled training data is plentiful, discriminative techniques are widely used since they give excellent generalization performance. However, for large-scale applications suc...
Julia A. Lasserre, Christopher M. Bishop, Thomas P...
— The list decoding algorithm can outperform the conventional unique decoding algorithm by producing a list of candidate decoded messages. An efficient list decoding algorithm fo...
L. Chen, Rolando A. Carrasco, Martin Johnston, E. ...