Simpler near-optimal controllers through direct supervision
Mathematics – Optimization and Control
Scientific paper
2009-08-20
The method of generalized Hamilton-Jacobi-Bellman (GHJB) equations is a powerful way of creating near-optimal controllers by learning. It rests on the fact that if we have a feedback controller and can learn to compute the gradient grad-J of its cost-to-go function, then we can use that gradient to define a better controller. The new controller's grad-J in turn defines a still better controller, and so on. Here I point out that GHJB works indirectly, in the sense that it does not learn the best approximation to grad-J but instead learns the time derivative dJ/dt and infers grad-J from that. I show that we can get simpler and lower-cost controllers by learning grad-J directly. To do this, we need teaching signals that report grad-J(x) for a varied set of states x. I show how to obtain these signals, using the GHJB equation to calculate one component of grad-J(x) -- the one parallel to dx/dt -- and computing all the other components by backward-in-time integration, using a formula similar to the Euler-Lagrange equation. I then compare this direct algorithm with GHJB on two test problems.
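The improvement loop the abstract describes can be made concrete in the linear-quadratic special case, where each step has a closed form: the cost-to-go of a linear feedback u = -Kx is quadratic, J(x) = x'Px, so grad-J = 2Px follows from a Lyapunov equation (the GHJB equation specializes to this), and that gradient defines the improved gain. The sketch below is an illustration of the general idea, not the paper's algorithm; the plant (a double integrator) and all matrix names (A, B, Q, R, K) are my own choices.

```python
# Illustrative GHJB-style policy iteration on a linear-quadratic problem.
# The plant and cost matrices here are assumptions, not from the paper.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

# Double-integrator dynamics xdot = A x + B u, running cost x'Qx + u'Ru.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

K = np.array([[1.0, 1.0]])  # any stabilizing initial feedback u = -K x

for _ in range(10):
    Acl = A - B @ K
    # "Evaluation" step: the current controller's cost-to-go J(x) = x'Px
    # satisfies the Lyapunov (here: GHJB) equation
    #   Acl' P + P Acl + Q + K' R K = 0,   so grad-J(x) = 2 P x.
    P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
    # "Improvement" step: the gradient defines a better controller.
    K = np.linalg.solve(R, B.T @ P)

# The iteration converges to the optimal LQR gain from the Riccati equation.
P_opt = solve_continuous_are(A, B, Q, R)
K_opt = np.linalg.solve(R, B.T @ P_opt)
print(np.allclose(K, K_opt, atol=1e-6))
```

In this linear case the evaluation step is exact; the paper's setting is the general nonlinear one, where grad-J must be learned from data rather than solved for in closed form.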