We address the problem of learning in repeated N-player (as opposed to 2-player) general-sum games. We describe an extension to existing criteria focusing explicitly on such setti...
We have recently presented CarpeDiem, an algorithm that can be used for speeding up the evaluation of Supervised Sequential Learning (SSL) classifiers. CarpeDiem provides impress...
We address two problems in the field of automatic optimization of dialogue strategies: learning effective dialogue strategies when no initial data or system exists, and evaluating...
We describe an approach that uses human-human interaction data to acquire the domain-specific dialog knowledge required to configure a task-oriented dialog system. The key aspe...
Closed-loop control relies on sensory feedback that is usually assumed to be free. But if sensing incurs a cost, it may be cost-effective to take sequences of actions in open-loop m...
Eric A. Hansen, Andrew G. Barto, Shlomo Zilberstei...