A notable features of many proposed Wireless Sensor Networks (WSNs) deployments is their scale: hundreds to thousands of nodes linked together. In such systems, modeling the state...
Probabilistic B (pB) [2, 8] extends classical B [7] to incorporate probabilistic updates together with the specification of quantitative safety properties. As for classical B, prob...
We consider Markov Decision Processes (MDPs) as transformers on probability distributions, where with respect to a scheduler that resolves nondeterminism, the MDP can be seen as ex...
Vijay Anand Korthikanti, Mahesh Viswanathan, Gul A...
Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is ...
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(), LSTD()...