Optimal cross-layer wireless control policies using TD learning

12 years 11 months ago


We present an on-line cross-layer control technique to characterize and approximate optimal policies for wireless networks. Our approach combines network utility maximization and adaptive modulation over an infinite discrete-time horizon using a class of performance measures we call time-smoothed utility functions. We model the system as an average-cost Markov decision problem. Model approximations are used to find suitable basis functions for application of least-squares TD-learning techniques. The approach yields network control policies that learn the underlying characteristics of the random wireless channel and that approximately optimize network performance.

Acknowledgment: Financial support from the National Science Foundation under CCF-0729031 and ITMANET DARPA RK 2006-07284 is gratefully acknowledged. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF or DARPA.
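As a rough illustration of the least-squares TD step named in the abstract, the sketch below runs a generic textbook LSTD(0) estimator for the average-cost criterion on a toy three-state Markov chain under a fixed policy. It is not the authors' implementation: the transition matrix `P`, the cost vector `cost`, and the indicator features (with state 0 pinned to zero so that the relative value function is unique) are all assumptions made for this example, whereas the paper derives its basis functions from model approximations of the wireless network.

```python
import numpy as np


def lstd_average_cost(states, costs, features):
    """Generic LSTD(0) for the average-cost criterion (illustrative sketch).

    states   : visited state indices x_0, ..., x_T along one trajectory
    costs    : per-state one-step cost, shape (n_states,)
    features : basis functions evaluated at each state, shape (n_states, d)
    Returns the estimated average cost and the weight vector theta, so that
    features @ theta approximates the relative value function.
    """
    phi = features[states[:-1]]        # phi(x_t)
    phi_next = features[states[1:]]    # phi(x_{t+1})
    c = costs[states[:-1]]             # c(x_t)
    eta_hat = c.mean()                 # sample-mean estimate of the average cost
    A = phi.T @ (phi - phi_next)       # sum_t phi(x_t) (phi(x_t) - phi(x_{t+1}))^T
    b = phi.T @ (c - eta_hat)          # sum_t phi(x_t) (c(x_t) - eta_hat)
    return eta_hat, np.linalg.solve(A, b)


# Hypothetical toy chain (assumed for illustration, not taken from the paper).
P = np.array([[0.5, 0.4, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
cost = np.array([0.0, 1.0, 3.0])
n = len(cost)

# Indicator features for states 1..n-1; state 0 maps to the zero vector,
# which fixes h(0) = 0 and removes the additive-constant ambiguity.
features = np.vstack([np.zeros(n - 1), np.eye(n - 1)])

# Simulate one trajectory of the chain.
rng = np.random.default_rng(0)
T = 50_000
states = np.empty(T + 1, dtype=int)
states[0] = 0
for t in range(T):
    states[t + 1] = rng.choice(n, p=P[states[t]])

eta_hat, theta = lstd_average_cost(states, cost, features)
h_hat = features @ theta

# Exact reference values from the model, used only to check the estimate.
M = np.vstack([P.T - np.eye(n), np.ones(n)])
pi = np.linalg.lstsq(M, np.append(np.zeros(n), 1.0), rcond=None)[0]
eta_exact = pi @ cost
h_exact = np.zeros(n)
h_exact[1:] = np.linalg.solve(np.eye(n - 1) - P[1:, 1:], (cost - eta_exact)[1:])
```

With these indicator features the estimate converges to the exact solution of Poisson's equation, c + Ph = h + η, normalized so that h(0) = 0. With a smaller, problem-specific basis of the kind the abstract describes, the same linear solve would instead return an approximation of the relative value function within the span of that basis.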

Sean P. Meyn, Wei Chen, Daniel O'Neill
