Abstract. This paper proposes an algorithm for combinatorial optimizations that uses reinforcement learning and estimation of joint probability distribution of promising solutions ...
The dominant existing routing strategies employed in peerto-peer(P2P) based information retrieval(IR) systems are similarity-based approaches. In these approaches, agents depend o...
We consider reinforcement learning as solving a Markov decision process with unknown transition distribution. Based on interaction with the environment, an estimate of the transit...
Standard Reinforcement Learning (RL) aims to optimize decision-making rules in terms of the expected return. However, especially for risk-management purposes, other criteria such ...
Abstract. Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, w...