This paper describes an algorithm, called CQ-learning, which learns to adapt the state representation for multi-agent systems in order to coordinate with other agents. We propose ...
— While underactuated robotic systems are capable of energy efficient and rapid dynamic behavior, we still do not fully understand how body dynamics can be actively used for ada...
We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooperati...
Jayakrishnan Unnikrishnan, Venugopal V. Veeravalli
Abstract. Feature selection in reinforcement learning (RL), i.e. choosing basis functions such that useful approximations of the unkown value function can be obtained, is one of th...
—We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooper...
Jayakrishnan Unnikrishnan, Venugopal V. Veeravalli