The UCT algorithm learns a value function online using sample-based search. The TD() algorithm can learn a value function offline for the on-policy distribution. We consider three...
We study the computational and sample complexity of parameter and structure learning in graphical models. Our main result shows that the class of factor graphs with bounded degree...
This paper summarizes recent advances in the application of multiagent coordination algorithms to air traffic flow management. Indeed, air traffic flow management is one of the fu...
Models of agent-environment interaction that use predictive state representations (PSRs) have mainly focused on the case of discrete observations and actions. The theory of discre...
Confidence measures for the generalization error are crucial when small training samples are used to construct classifiers. A common approach is to estimate the generalization err...