We study functions with multiple output values, and use active sampling to identify an example for each of the possible output values. Our results for this setting include: (1) Eff...
Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares ...
Wireless control systems (WCSs) often have to operate in dynamic environments where the network traffic load may vary unpredictably over time. The sampling in sensors is conventi...
Off-policy reinforcement learning is aimed at efficiently reusing data samples gathered in the past, which is an essential problem for physically grounded AI as experiments are us...
Reinforcement learning problems are commonly tackled with temporal difference methods, which attempt to estimate the agent's optimal value function. In most real-world proble...