We investigate the runtime of the Binary Particle Swarm Optimization (PSO) algorithm introduced by Kennedy and Eberhart (1997). The Binary PSO maintains a global best solution and...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
We present a dynamic programming approach for the solution of first-order Markov decisions processes. This technique uses an MDP whose dynamics is represented in a variant of the ...
Optimal Brain Damage (OBD) is a method for reducing the number of weights in a neural network. OBD estimates the increase in cost function if weights are pruned and is a valid app...
Volker Tresp, Ralph Neuneier, Hans-Georg Zimmerman...
We have investigated the potential for using genetic programming to evolve compiler parsing and translation routines for processing arithmetic and logical expressions as they are ...