Learning capabilities of computer systems still lag far behind biological systems. One of the reasons can be seen in the inefficient re-use of control knowledge acquired over the...
Partially-observable Markov decision processes (POMDPs) provide a powerful model for sequential decision-making problems with partially-observed state and are known to have (appro...
Planning graphs have been shown to be a rich source of heuristic information for many kinds of planners. In many cases, planners must compute a planning graph for each element of ...
Daniel Bryce, William Cushing, Subbarao Kambhampat...
Abstract--Fingerprinting operators generate functional signatures of game players and are useful for their automated analysis independent of representation or encoding. The theory ...
Abstract. Reinforcement learning (RL) is a widely used learning paradigm for adaptive agents. There exist several convergent and consistent RL algorithms which have been intensivel...
Lucian Busoniu, Damien Ernst, Bart De Schutter, Ro...