We study an approach to policy selection for large relational Markov Decision Processes (MDPs). We consider a variant of approximate policy iteration (API) that replaces the usual...
This paper examines a number of solution methods for decision processes with non-Markovian rewards (NMRDPs). They all exploit a temporal logic specification of the reward functio...
The first part of the paper develops a novel, sortally-based approach to the problem of aspectual composition. The account is argued to be superior on both empirical and computati...
While exploring to find better solutions, an agent performing online reinforcement learning (RL) can perform worse than is acceptable. In some cases, exploration might have unsafe, ...
Satinder P. Singh, Andrew G. Barto, Roderic A. Gru...
A modest exception-allowing inheritance reasoner is presented. The reasoner allows restricted, but semantically well-founded, defeasible property inheritance. Furthermore, it give...