Sciweavers

1310 search results - page 113 / 262
» Progressive Optimization in Action
ICML
2004
IEEE
16 years 5 months ago
Learning to fly by combining reinforcement learning with behavioural cloning
Reinforcement learning deals with learning optimal or near optimal policies while interacting with the environment. Application domains with many continuous variables are difficul...
Eduardo F. Morales, Claude Sammut
GLOBECOM
2008
IEEE
15 years 11 months ago
Foresighted Resource Reciprocation Strategies in P2P Networks
We consider peer-to-peer (P2P) networks, where multiple peers are interested in sharing content. While sharing resources, autonomous and self-interested peers need to make decis...
Hyunggon Park, Mihaela van der Schaar
PRICAI
1999
Springer
15 years 9 months ago
Making Rational Decisions in N-by-N Negotiation Games with a Trusted Third Party
The optimal decision for an agent to make at a given game situation often depends on the decisions that other agents make at the same time. Rational agents will try to find a stabl...
Shih-Hung Wu, Von-Wun Soo
ML
2002
ACM
15 years 4 months ago
Finite-time Analysis of the Multiarmed Bandit Problem
Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while t...
Peter Auer, Nicolò Cesa-Bianchi, Paul Fisch...
AIPS
2008
15 years 7 months ago
A Compact and Efficient SAT Encoding for Planning
In the planning-as-SAT paradigm there have been numerous recent developments towards improving the speed and scalability of planning at the cost of finding a step-optimal parallel...
Nathan Robinson, Charles Gretton, Duc Nghia Pham, ...