A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must learn an accurate model of ...
We propose a new neural network architecture, called Simple Recurrent Temporal-Difference Networks (SR-TDNs), that learns to predict future observations in partially observable en...