We consider reinforcement learning in the parameterized setup, where the model is known to belong to a parameterized family of Markov Decision Processes (MDPs). We further impose ...
PAC-MDP algorithms approach the exploration-exploitation problem of reinforcement learning agents in an effective way which guarantees that with high probability, the algorithm pe...
One of the central challenges in reinforcement learning is to balance the exploration/exploitation tradeoff while scaling up to large problems. Although model-based reinforcement ...
Abstract— Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers...
Planning problems are often formulated as heuristic search. The choice of the heuristic function plays a significant role in the performance of planning systems, but a good heuris...