Reinforcement learning problems are commonly tackled with temporal difference methods, which attempt to estimate the agent's optimal value function. In most real-world proble...
A selective acknowledgment (SACK) mechanism, combined with a selective repeat retransmission policy, has been proposed to overcome the limitations with the cumulative acknowledgme...
Estimating insurance premia from data is a difficult regression problem for several reasons: the large number of variables, many of which are discrete, and the very peculiar shape...
Nicolas Chapados, Yoshua Bengio, Pascal Vincent, J...
Markov decision processes (MDPs) are a very popular tool for decision theoretic planning (DTP), partly because of the welldeveloped, expressive theory that includes effective solu...
Closed-loop control relies on sensory feedback that is usually assumed to be free. But if sensing incurs a cost, it may be coste ective to take sequences of actions in open-loop m...
Eric A. Hansen, Andrew G. Barto, Shlomo Zilberstei...