Abstract. We consider a control problem where the decision maker interacts with a standard Markov decision process with the exception that the reward functions vary arbitrarily ove...
Most models of utility elicitation in decision support and interactive optimization assume a predefined set of "catalog" features over which user preferences are express...
We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting. Specifically, we take a mo...
Research in learning and planning in real-time strategy (RTS) games is very interesting in several industries such as military industry, robotics, and most importantly game industr...
Ibrahim Fathy, Mostafa Aref, Omar Enayet, Abdelrah...
It is well known that there cannot be a single "best" heuristic for optimal planning in general. One way of overcoming this is by combining admissible heuristics (e.g. b...