Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability d...
ate Abstractions of Discrete-Time Controlled Stochastic Hybrid Systems Alessandro D’Innocenzo, Alessandro Abate, and Maria D. Di Benedetto — This work proposes a procedure to c...
Alessandro D'Innocenzo, Alessandro Abate, Maria Do...
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...
Abstract—This paper considers maximizing throughput utility in a multi-user network with partially observable Markov ON/OFF channels. Instantaneous channel states are never known...
Abstract— This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested i...
Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...