Computer Science – Learning
Scientific paper
2007-04-11
9 pages, 5 figures
LSTD is numerically unstable for some ergodic Markov chains in which certain states are visited much more often than the rest, because the matrix that LSTD accumulates has a large condition number. In this paper, we propose a variant of temporal difference learning with high data efficiency. A class of preconditioned temporal difference learning algorithms is also proposed to speed up the new method. The class includes LSPE and several new data-efficient algorithms. The data efficiency of these algorithms is validated by learning an absorbing Markov chain. The asymptotic properties of the new algorithms are also analyzed.
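The general idea of accumulating the LSTD matrix and then solving with a preconditioned iteration can be illustrated with a minimal sketch. This is not the algorithm from the paper: it assumes one-hot (tabular) features, a hand-made two-state sample in which state 0 is visited far more often than state 1, and a simple diagonal (Jacobi) preconditioner chosen only for illustration. The names lstd_accumulate, preconditioned_iteration, and max_residual are hypothetical.

```python
def lstd_accumulate(transitions, features, gamma):
    """Accumulate A = sum phi(s) (phi(s) - gamma*phi(s'))^T and b = sum phi(s) r."""
    n = len(next(iter(features.values())))
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for s, r, s_next in transitions:
        phi, phi_next = features[s], features[s_next]
        for i in range(n):
            b[i] += phi[i] * r
            for j in range(n):
                A[i][j] += phi[i] * (phi[j] - gamma * phi_next[j])
    return A, b


def preconditioned_iteration(A, b, precond_diag, steps):
    """Iterate theta <- theta + C^{-1} (b - A theta) with diagonal C (an assumption)."""
    n = len(b)
    theta = [0.0] * n
    for _ in range(steps):
        residual = [b[i] - sum(A[i][j] * theta[j] for j in range(n))
                    for i in range(n)]
        theta = [theta[i] + residual[i] / precond_diag[i] for i in range(n)]
    return theta


def max_residual(A, b, theta):
    """Largest absolute entry of b - A theta."""
    n = len(b)
    return max(abs(b[i] - sum(A[i][j] * theta[j] for j in range(n)))
               for i in range(n))


# Illustrative data (assumed): state 0 is visited 99 times, state 1 once.
features = {0: [1.0, 0.0], 1: [0.0, 1.0]}
transitions = [(0, 1.0, 0)] * 98 + [(0, 1.0, 1), (1, 0.0, 0)]
gamma = 0.9

A, b = lstd_accumulate(transitions, features, gamma)
jacobi = [A[i][i] for i in range(len(b))]  # diagonal of A as preconditioner
theta = preconditioned_iteration(A, b, jacobi, steps=200)
print(theta, max_residual(A, b, theta))
```

With tabular features and a discount factor below 1, the accumulated matrix is strictly diagonally dominant by rows, so this Jacobi-style iteration converges. Other preconditioners and feature sets would need their own convergence argument.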
Preconditioned Temporal Difference Learning