While exploring to find better solutions, an agent performing online reinforcement learning (RL) can perform worse than is acceptable. In some cases, exploration might have unsafe, ...
Satinder P. Singh, Andrew G. Barto, Roderic A. Gru...
Coping with outliers contaminating dynamical processes is of major importance in various applications because mismatches from nominal models are not uncommon in practice...
Shahrokh Farahmand, Georgios B. Giannakis, Daniele...
In this paper, we propose and analyze a methodology for providing statistical guarantees within the diffserv model in a network that uses static-priority schedulers. We extend th...
The aim of this paper is to investigate how different smoothing parameter levels of the Automatic Pipeline Inventory and Order Based Production Control System smoothing replenishm...
This paper describes 3D biped walking generation and control based on Limit Cycle Walking. In our study, we use the simplest possible 3D biped model with three DOFs, incorporat...
Kentaro Miyahara, Yuzuru Harada, Dragomir N. Nench...