We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Adaptive predictive search (APS) is a learning system framework which, given little initial domain knowledge, increases its decision-making abilities in complex problem domains....
Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...
A fast-growing body of research in the AI and machine learning communities addresses learning in games, where there are multiple learners with different interests. This research a...
This paper outlines and discusses the pedagogical approach, the technical design architecture, and an innovative implementation of a collaborative role-play simulation technology ...