We introduce and analyze a new algorithm for linear classification which combines Rosenblatt's perceptron algorithm with Helmbold and Warmuth's leave-one-out method. Like...
We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...
We provide a framework for distributed systems that impose timing constraints on their executions. We propose a timed model of communicating finite-state machines, which...
Policy evaluation is a critical step in the approximate solution of large Markov decision processes (MDPs), typically requiring O(|S|^3) to directly solve the Bellman system of |S...
Mainstream object-oriented languages, such as C++ and Java, provide only a restricted form of polymorphic methods, namely uni-receiver dispatch. In common programming situations...
Christopher Dutchyn, Paul Lu, Duane Szafron, Steve...