Sciweavers

29 search results - page 4 / 6
» Balancing Exploration and Exploitation: A New Algorithm for ...
TSP
2012
12 years 1 month ago
Sensing and Probing Cardinalities for Active Cognitive Radios
In a cognitive radio network, opportunistic spectrum access (OSA) to the underutilized spectrum involves not only sensing the spectrum occupancy but also probing the channel qua...
Thang Van Nguyen, Hyundong Shin, Tony Q. S. Quek, ...
ML
2002
ACM
13 years 5 months ago
Near-Optimal Reinforcement Learning in Polynomial Time
We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...
Michael J. Kearns, Satinder P. Singh
ICML
2006
IEEE
14 years 6 months ago
An analytic solution to discrete Bayesian reinforcement learning
Reinforcement learning (RL) was originally proposed as a framework to allow agents to learn in an online fashion as they interact with their environment. Existing RL algorithms co...
Pascal Poupart, Nikos A. Vlassis, Jesse Hoey, Kevi...
EUROCAST
2007
Springer
13 years 12 months ago
A k-NN Based Perception Scheme for Reinforcement Learning
Abstract. A paradigm of modern Machine Learning (ML) which uses rewards and punishments to guide the learning process. One of the central ideas of RL is learning by “direct-online...
José Antonio Martin H., Javier de Lope Asia...
ATAL
2010
Springer
13 years 6 months ago
Learning context conditions for BDI plan selection
An important drawback to the popular Belief, Desire, and Intentions (BDI) paradigm is that such systems include no element of learning from experience. In particular, the so-calle...
Dhirendra Singh, Sebastian Sardiña, Lin Pad...