Off-policy reinforcement learning aims to efficiently reuse data samples gathered in the past, which is an essential problem for physically grounded AI as experiments are us...
A situation where training and test samples follow different input distributions is called covariate shift. Under covariate shift, standard learning methods such as maximum likeli...
For dynamic sales dialogs in electronic commerce scenarios, approaches based on an information gain measure used for attribute selection have been suggested. These measures conside...
This paper proposes a new planning architecture for agents operating in uncertain and dynamic environments. Decision-theoretic planning has been recognized as a useful tool for rea...
Designing a collaborative architecture for real-time applications is an intricate challenge that usually involves dealing with real-time constraints, resource limitations and ...
Dewan Tanvir Ahmed, Shervin Shirmohammadi, Abdulmo...