We consider the asymmetric traveling salesperson problem with γ-parameterized triangle inequality for γ ∈ [1/2, 1). That means, the edge weights fulfill w(u, v)
In this paper, we consider the problem of planning and learning in infinite-horizon discounted-reward Markov decision problems. We propose a novel iterative direct policysearc...