Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
We investigate methods for planning in a Markov Decision Process where the cost function is chosen by an adversary after we fix our policy. As a running example, we consider a rob...
H. Brendan McMahan, Geoffrey J. Gordon, Avrim Blum
In [1], Murata et al introduced an elegant representation of block placement called sequence pair. All block placement algorithms which are based on sequence pairs use simulated a...
Most graph layout algorithms treat nodes as points. The problem of node overlap removal is to adjust the layout generated by such methods so that nodes of non-zero width and height...
This paper presents a real-time single-camera surveillance system, aiming at detecting and partly analyzing a group of people. A set of moving persons is segmented using a combina...