ion from the implementation they use. Some tools use this abstract model fairly consistently; others at least partly keep the implementation view, modeling components, e.g., by o...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
We explore techniques for maintaining consistency in reasoning when employing dynamic hierarchical task decompositions. In particular, we consider the difficulty of maintaining co...
In this paper, we discuss two major tradeoffs, spatial and temporal, that appear when applying market-based computing to multimedia network applications. The former appea...
Links are established between three widely used modeling frameworks for reactive systems: the ioco theory of Tretmans, the interface automata of De Alfaro and Henzinger, and Mealy ...