Sciweavers

417 search results - page 56 / 84
» Reinforcement Learning Estimation of Distribution Algorithm
ICML
2007
IEEE
16 years 3 months ago
Combining online and offline knowledge in UCT
The UCT algorithm learns a value function online using sample-based search. The TD(λ) algorithm can learn a value function offline for the on-policy distribution. We consider three...
Sylvain Gelly, David Silver
JMLR
2006
15 years 2 months ago
Learning Factor Graphs in Polynomial Time and Sample Complexity
We study the computational and sample complexity of parameter and structure learning in graphical models. Our main result shows that the class of factor graphs with bounded degree...
Pieter Abbeel, Daphne Koller, Andrew Y. Ng
AAAI
2008
15 years 4 months ago
Adaptive Management of Air Traffic Flow: A Multiagent Coordination Approach
This paper summarizes recent advances in the application of multiagent coordination algorithms to air traffic flow management. Indeed, air traffic flow management is one of the fu...
Kagan Tumer, Adrian K. Agogino
ATAL
2007
Springer
15 years 6 months ago
On discovery and learning of models with predictive representations of state for agents with continuous actions and observations
Models of agent-environment interaction that use predictive state representations (PSRs) have mainly focused on the case of discrete observations and actions. The theory of discre...
David Wingate, Satinder P. Singh
UAI
2008
15 years 3 months ago
Small Sample Inference for Generalization Error in Classification Using the CUD Bound
Confidence measures for the generalization error are crucial when small training samples are used to construct classifiers. A common approach is to estimate the generalization err...
Eric Laber, Susan Murphy