Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
This paper uses partially observable Markov decision processes (POMDP’s) as a basic framework for MultiAgent planning. We distinguish three perspectives: first one is that of a...
Bharaneedharan Rathnasabapathy, Piotr J. Gmytrasie...
Rapidly-spreading malicious software is an important threat on today’s computer networks. Most solutions that have been proposed to counter this threat are based on our ability ...
—We propose a steepest descent method to compute optimal control parameters for balancing between multiple performance objectives in stateless stochastic scheduling, wherein the ...
Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip, Na...
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...