Sciweavers

797 search results - page 69 / 160
» Timed Control with Partial Observability
NIPS
2001
15 years 1 month ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
MOBIHOC
2009
ACM
16 years 1 month ago
Admission control and scheduling for QoS guarantees for variable-bit-rate applications on wireless channels
Providing differentiated Quality of Service (QoS) over unreliable wireless channels is an important challenge for supporting several future applications. We analyze a model that h...
I-Hong Hou, P. R. Kumar
CORR
2007
Springer
15 years 16 days ago
Universal Reinforcement Learning
We consider an agent interacting with an unmodeled environment. At each time, the agent makes an observation, takes an action, and incurs a cost. Its actions can influence futu...
Vivek F. Farias, Ciamac Cyrus Moallemi, Tsachy Wei...
FSTTCS
2005
Springer
15 years 6 months ago
The MSO Theory of Connectedly Communicating Processes
We identify a network of sequential processes that communicate by synchronizing frequently on common actions. More precisely, we demand that there is a bound k such that ...
P. Madhusudan, P. S. Thiagarajan, Shaofa Yang
PLDI
1993
ACM
15 years 4 months ago
Dependence-Based Program Analysis
Program analysis and optimization can be speeded up through the use of the dependence flow graph (DFG), a representation of program dependences which generalizes def-use chains and...
Richard Johnson, Keshav Pingali