We formalize the associative bandit problem framework introduced by Kaelbling as a learning-theory problem. The learning environment is modeled as a k-armed bandit where arm payof...
Alexander L. Strehl, Chris Mesterharm, Michael L. ...
The microarchitectural design space of a new processor is too large for an architect to evaluate in its entirety. Even with the use of statistical simulation, evaluation of a sing...
Christophe Dubach, Timothy M. Jones, Michael F. P....
The paper describes our first experiments on Reinforcement Learning to steer a real robot car. The applied method, Neural Fitted Q Iteration (NFQ), is purely data-driven based on ...
Martin Riedmiller, Michael Montemerlo, Hendrik Dah...
This paper proposes a novel Mass Spectrometry data profiling method for ovarian cancer detection based on negative correlation learning (NCL). A modified Smoothed Nonlinear Energy ...
Service discovery is a key step in the convergence of Peer-to-Peer (P2P) networks with Web Services. In this paper, a semantic-P2P based approach is presented for web service discovery. To e...