Discounted deterministic Markov decision processes and discounted all-pairs shortest paths

14 years 11 months ago


We present two new algorithms for finding optimal strategies for discounted, infinite-horizon, Deterministic Markov Decision Processes (DMDP). The first one is an adaptation of an algorithm of Young, Tarjan and Orlin for finding minimum mean weight cycles. It runs in O(mn + n^2 log n) time, where n is the number of vertices (or states) and m is the number of edges (or actions). The second one is an adaptation of a classical algorithm of Karp for finding minimum mean weight cycles. It runs in O(mn) time. The first algorithm has a slightly slower worst-case complexity, but is faster than the second algorithm in many situations. Both algorithms improve on a recent O(mn^2)-time algorithm of Andersson and Vorobyov. We also present a randomized Õ(m^(1/2) n^2)-time algorithm for finding Discounted All-Pairs Shortest Paths (DAPSP), improving several previous algorithms.
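For orientation, the classical algorithm of Karp mentioned above can be sketched as follows. This is the standard undiscounted minimum mean weight cycle computation, not the discounted adaptation the abstract describes. The function name `min_mean_cycle`, the edge-list representation, and the restriction to integer weights (so that exact fractions can be used) are assumptions made for this illustration only.

```python
from fractions import Fraction
from math import inf


def min_mean_cycle(n, edges):
    """Return the minimum mean weight over all cycles, or None if acyclic.

    n     -- number of vertices, labelled 0..n-1
    edges -- list of (u, v, w) triples with integer weight w (assumption)
    """
    # d[k][v] = minimum weight of a walk with exactly k edges ending at v.
    # Starting every vertex at 0 acts like an artificial source connected
    # to all vertices by zero-weight edges, so no reachability is assumed.
    d = [[inf] * n for _ in range(n + 1)]
    d[0] = [0] * n
    for k in range(1, n + 1):
        for u, v, w in edges:
            if d[k - 1][u] != inf and d[k - 1][u] + w < d[k][v]:
                d[k][v] = d[k - 1][u] + w

    # Karp's formula: min over v of max over k of (d[n][v] - d[k][v]) / (n - k).
    best = None
    for v in range(n):
        if d[n][v] == inf:
            continue  # no walk with n edges ends at v
        worst = max(
            Fraction(d[n][v] - d[k][v], n - k)
            for k in range(n)
            if d[k][v] != inf
        )
        if best is None or worst < best:
            best = worst
    return best
```

The table has n + 1 rows and each row is filled with one pass over the m edges, which gives the O(mn) running time quoted for Karp's algorithm.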

Omid Madani, Mikkel Thorup, Uri Zwick
