While exploring to find better solutions, an agent performing online reinforcement learning (RL) can perform worse than is acceptable. In some cases, exploration might have unsafe, ...
Satinder P. Singh, Andrew G. Barto, Roderic A. Gru...
In this work, a Modified Vector Field Histogram (MVFH) has been developed to improve path planning and obstacle avoidance for a wheel-driven mobile robot. It permits the detecti...
Procedural representations of control policies have two advantages when facing the scale-up problem in learning tasks. First, they are implicit, with potential for inductive genera...
The paper presents LOCO-Analyst, an educational tool for providing teachers with feedback on the relevant aspects of the learning process taking place in a web-based learning envir...
Jelena Jovanovic, Dragan Gasevic, Christopher A. B...
We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-i...
Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...