Sciweavers

129 search results - page 20 / 26
» Automatic Recovery Using Bounded Partially Observable Markov...
AAAI
2011
14 years 14 days ago
Linear Dynamic Programs for Resource Management
Sustainable resource management in many domains presents large continuous stochastic optimization problems, which can often be modeled as Markov decision processes (MDPs). To solv...
Marek Petrik, Shlomo Zilberstein
CONNECTION
2008
15 years 16 days ago
Spoken language interaction with model uncertainty: an adaptive human-robot interaction system
Spoken language is one of the most intuitive forms of interaction between humans and agents. Unfortunately, agents that interact with people using natural language often experienc...
Finale Doshi, Nicholas Roy
CSL
2012
Springer
13 years 8 months ago
Reinforcement learning for parameter estimation in statistical spoken dialogue systems
Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estim...
Filip Jurcícek, Blaise Thomson, Steve Young
CORR
2012
Springer
13 years 8 months ago
Cops and Invisible Robbers: the Cost of Drunkenness
We examine a version of the Cops and Robber (CR) game in which the robber is invisible, i.e., the cops do not know his location until they capture him. Apparently this game (CiR) h...
Athanasios Kehagias, Dieter Mitsche, Pawel Pralat
ATAL
2009
Springer
15 years 7 months ago
SarsaLandmark: an algorithm for learning in POMDPs with landmarks
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...
Michael R. James, Satinder P. Singh