One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...
Exotic semirings such as the “(max, +) semiring” (R ∪ {−∞}, max, +), or the “tropical semiring” (N ∪ {+∞}, min, +), have been invented and reinvented many times s...
In this article, we present an idea for solving deterministic partially observable markov decision processes (POMDPs) based on a history space containing sequences of past observat...
We are interested in building decision-support software for social welfare case managers. Our model in the form of a factored Markov decision process is so complex that a standard...
Partially Observable Markov Decision Processes (POMDPs) are a well-established and rigorous framework for sequential decision-making under uncertainty. POMDPs are well-known to be...