Computer Science – Learning
Scientific paper
2006-11-29
In European Symposium on Artificial Neural Networks (2006)
This paper addresses policy evaluation in Markov Decision Processes using linear function approximation. It provides a unified view of algorithms such as TD(lambda), LSTD(lambda), iLSTD, and residual-gradient TD. It argues that they all amount to minimizing a gradient function, and that they differ in the form of this function and in the means used to minimize it. Two new schemes are introduced in that framework: Full-gradient TD, which generalizes the principle introduced in iLSTD, and EGD TD, which reduces the gradient by successive equi-gradient descents. Together with iLSTD, these two schemes form a new intermediate family of three algorithms that makes much better use of the samples than TD while keeping a gradient descent scheme, which is useful for complexity reasons and for optimistic policy iteration.
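As a rough illustration of the unified view, the sketch below contrasts two of the algorithms named in the abstract, TD(0) and LSTD(0), on a toy problem. It is not taken from the paper: the three-state chain, the two features per state, the rewards, the discount factor and the step size are all assumptions made for the example, and the function names are hypothetical. The summed update g(theta) = sum_t phi_t * delta_t(theta) is one standard reading of the "gradient function"; the paper's exact definition, Full-gradient TD and EGD TD are not reproduced here. The rewards are chosen so that the value function is exactly representable, which lets both methods reach the same point; with a constant step size, TD(0) generally only fluctuates around it.

```python
# Toy illustration (not from the paper): TD(0) and LSTD(0) with linear features.
# Chain, features, rewards, discount and step size are assumptions for this sketch.

GAMMA = 0.5
# Deterministic 3-state cycle 0 -> 1 -> 2 -> 0 with two features per state.
PHI = {0: (1.0, 0.0), 1: (0.0, 1.0), 2: (1.0, 1.0)}
REWARD = {0: 0.0, 1: 0.5, 2: 2.5}  # chosen so that V = (1, 2, 3) is representable
NEXT = {0: 1, 1: 2, 2: 0}


def make_samples(n_steps, start=0):
    """Return a list of (phi, reward, phi_next) transitions along the cycle."""
    samples, s = [], start
    for _ in range(n_steps):
        s_next = NEXT[s]
        samples.append((PHI[s], REWARD[s], PHI[s_next]))
        s = s_next
    return samples


def td_error(theta, phi, r, phi_next):
    """Temporal-difference error r + gamma * V(s') - V(s) under weights theta."""
    v = sum(t * f for t, f in zip(theta, phi))
    v_next = sum(t * f for t, f in zip(theta, phi_next))
    return r + GAMMA * v_next - v


def summed_update(theta, samples):
    """g(theta) = sum_t phi_t * delta_t(theta), the quantity both methods drive to zero."""
    g = [0.0, 0.0]
    for phi, r, phi_next in samples:
        d = td_error(theta, phi, r, phi_next)
        g = [gi + d * f for gi, f in zip(g, phi)]
    return g


def td0(samples, alpha=0.1):
    """Incremental scheme: one small step along phi_t * delta_t per sample."""
    theta = [0.0, 0.0]
    for phi, r, phi_next in samples:
        d = td_error(theta, phi, r, phi_next)
        theta = [t + alpha * d * f for t, f in zip(theta, phi)]
    return theta


def lstd0(samples):
    """Direct scheme: accumulate A and b, then solve A theta = b (2x2, Cramer's rule)."""
    a = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for phi, r, phi_next in samples:
        diff = [f - GAMMA * fn for f, fn in zip(phi, phi_next)]
        for i in range(2):
            b[i] += phi[i] * r
            for j in range(2):
                a[i][j] += phi[i] * diff[j]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


samples = make_samples(3000)
theta_lstd = lstd0(samples)
theta_td = td0(samples)
print("LSTD(0):", theta_lstd, "summed update:", summed_update(theta_lstd, samples))
print("TD(0):  ", theta_td)
```

Both functions consume the same samples and target the same zero of the summed update; they differ in how they get there, which is the kind of distinction the abstract draws. LSTD(0) pays for a linear solve, while TD(0) spends only a constant amount of work per sample but needs many passes over the data.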
Loth Manuel
Preux Philippe
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD