Sciweavers

51 search results - page 10 / 11
» Exponentiated Gradient Methods for Reinforcement Learning
WEBDB
2010
Springer
13 years 11 months ago
Learning Topical Transition Probabilities in Click Through Data with Regression Models
The transition of search engine users' intents has been studied for a long time. The knowledge of intent transition, once discovered, can yield a better understanding of how di...
Xiao Zhang, Prasenjit Mitra
NETCOOP
2007
Springer
14 years 11 days ago
Load Shared Sequential Routing in MPLS Networks: System and User Optimal Solutions
Recently Gerald Ash has shown through case studies that event dependent routing is attractive in large scale multi-service MPLS networks. In this paper, we consider the application...
Gilles Brunet, Fariba Heidari, Lorne Mason
COLT
2006
Springer
13 years 10 months ago
Logarithmic Regret Algorithms for Online Convex Optimization
In an online convex optimization problem a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space, from a fixed feasible set. After ea...
Elad Hazan, Adam Kalai, Satyen Kale, Amit Agarwal
ATAL
2010
Springer
13 years 7 months ago
Learning multi-agent state space representations
This paper describes an algorithm, called CQ-learning, which learns to adapt the state representation for multi-agent systems in order to coordinate with other agents. We propose ...
Yann-Michaël De Hauwere, Peter Vrancx, Ann No...
UAI
2003
13 years 7 months ago
On the Convergence of Bound Optimization Algorithms
Many practitioners who use EM and related algorithms complain that they are sometimes slow. When does this happen, and what can be done about it? In this paper, we study the gener...
Ruslan Salakhutdinov, Sam T. Roweis, Zoubin Ghahra...