Sciweavers

Balancing Exploration and Exploitation in Learning to Rank O...
GECCO
2006
Springer
On-line evolutionary computation for reinforcement learning in stochastic domains
In reinforcement learning, an agent interacting with its environment strives to learn a policy that specifies, for each state it may encounter, what action to take. Evolutionary c...
Shimon Whiteson, Peter Stone
PKDD
2010
Springer
Gaussian Processes for Sample Efficient Reinforcement Learning with RMAX-Like Exploration
We present an implementation of model-based online reinforcement learning (RL) for continuous domains with deterministic transitions that is specifically designed to achi...
Tobias Jung, Peter Stone
ICML
2005
IEEE
A theoretical analysis of Model-Based Interval Estimation
Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Inter...
Alexander L. Strehl, Michael L. Littman
EMNLP
2009
Empirical Exploitation of Click Data for Task Specific Ranking
There have been increasing needs for task specific rankings in web search such as rankings for specific query segments like long queries, time-sensitive queries, navigational quer...
Anlei Dong, Yi Chang, Shihao Ji, Ciya Liao, Xin Li...
CVPR
2012
IEEE
Stream-based Joint Exploration-Exploitation Active Learning
Learning from streams of evolving and unbounded data is an important problem, for example in visual surveillance or internet scale data. For such large and evolving real-world data...
Chen Change Loy, Timothy M. Hospedales, Tao Xiang,...