Sciweavers

771 search results - page 12 / 155
» Markov Decision Processes with Arbitrary Reward Processes
ICC
2008
IEEE
15 years 8 months ago
Optimality of Myopic Sensing in Multi-Channel Opportunistic Access
We consider opportunistic communications over multiple channels where the state (“good” or “bad”) of each channel evolves as independent and identically distributed Mark...
Tara Javidi, Bhaskar Krishnamachari, Qing Zhao, Mi...
FOCS
2003
IEEE
15 years 7 months ago
Approximation Algorithms for Orienteering and Discounted-Reward TSP
In this paper, we give the first constant-factor approximation algorithm for the rooted Orienteering problem, as well as a new problem that we call the Discounted-Reward TSP, motivate...
Avrim Blum, Shuchi Chawla, David R. Karger, Terran...
ATAL
2010
Springer
15 years 2 months ago
Combining manual feedback with subsequent MDP reward signals for reinforcement learning
As learning agents move from research labs to the real world, it is increasingly important that human users, including those without programming skills, be able to teach agents de...
W. Bradley Knox, Peter Stone
CORR
2010
Springer
14 years 8 months ago
Online Learning in Opportunistic Spectrum Access: A Restless Bandit Approach
We consider an opportunistic spectrum access (OSA) problem where the time-varying condition of each channel (e.g., as a result of random fading or certain primary users' activ...
Cem Tekin, Mingyan Liu
ICASSP
2011
IEEE
14 years 5 months ago
Adaptive scalable layer filtering process for video scheduling over wireless networks based on MAC buffer management
In this paper, the problem of scalable video delivery over a time-varying wireless channel is considered. Packet scheduling and buffer management in both Application and Medium Acc...
Nesrine Changuel, Nicholas Mastronarde, Mihaela va...