A Minimum Relative Entropy Controller for Undiscounted Markov Decision Processes

Computer Science – Artificial Intelligence

Scientific paper

Details

8 pages, 3 figures, 3 tables

Adaptive control problems are notoriously difficult to solve, even in the presence of plant-specific controllers. One way to bypass the intractable computation of the optimal policy is to restate the adaptive control problem as the minimization of the relative entropy of a controller that ignores the true plant dynamics from an informed controller. The solution is given by the Bayesian control rule, a set of equations characterizing a stochastic adaptive controller for the class of possible plant dynamics. Here, the Bayesian control rule is applied to derive BCR-MDP, a controller to solve undiscounted Markov decision processes with finite state and action spaces and unknown dynamics. In particular, we derive a non-parametric conjugate prior distribution over the policy space that encapsulates the agent's whole relevant history, and we present a Gibbs sampler to draw random policies from this distribution. Preliminary results show that BCR-MDP successfully avoids sub-optimal limit cycles due to its built-in mechanism to balance exploration versus exploitation.
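
The relative entropy mentioned in the abstract is commonly known as the Kullback-Leibler divergence. As a rough illustration only, the sketch below computes it for two discrete action distributions. The function name `relative_entropy` and the example distributions are assumptions made for this illustration and are not taken from the paper; the paper's actual objective is defined over controllers and plant dynamics, not over a single pair of distributions.

```python
import math


def relative_entropy(p, q):
    """Return the KL divergence D(p || q) in nats for discrete distributions.

    Hypothetical helper for illustration; p and q are sequences of
    probabilities over the same finite set of actions.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if qi == 0.0:
            return math.inf  # p puts mass where q puts none
        total += pi * math.log(pi / qi)
    return total


# Assumed example: an informed controller that strongly prefers action 0,
# compared with a controller that is indifferent between the two actions.
informed = [0.9, 0.1]
uninformed = [0.5, 0.5]
print(relative_entropy(informed, uninformed))
```

The divergence is zero when the two distributions coincide and grows as the uninformed controller's action probabilities depart from those of the informed one.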
