Sciweavers

2005 search results - page 190 / 401
» Decisive Markov Chains
Sort
View
ATAL
2009
Springer
15 years 8 months ago
Online exploration in least-squares policy iteration
One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...
STACS
1997
Springer
15 years 5 months ago
Methods and Applications of (MAX, +) Linear Algebra
Exotic semirings such as the “(max, +) semiring” (R ∪ {−∞}, max, +), or the “tropical semiring” (N ∪ {+∞}, min, +), have been invented and reinvented many times s...
Stephane Gaubert, Max Plus
ATAL
2007
Springer
15 years 5 months ago
Interactive dynamic influence diagrams
This paper extends the framework of dynamic influence diagrams (DIDs) to the multi-agent setting. DIDs are computational representations of the Partially Observable Markov Decisio...
Kyle Polich, Piotr J. Gmytrasiewicz
AI
2006
Springer
15 years 5 months ago
An Efficient Resource Allocation Approach in Real-Time Stochastic Environment
We are interested in contributing to solving effectively a particular type of real-time stochastic resource allocation problem. Firstly, one distinction is that certain tasks may c...
Pierrick Plamondon, Brahim Chaib-draa, Abder Rezak...
CORR
2011
Springer
175views Education» more  CORR 2011»
14 years 8 months ago
Adaptive Channel Recommendation for Dynamic Spectrum Access
—We propose a dynamic spectrum access scheme where secondary users recommend “good” channels to each other and access accordingly. We formulate the problem as an average rewa...
Xu Chen, Jianwei Huang, Husheng Li