Computer Science – Systems and Control
Scientific paper
2012-02-23
We study the minmax optimization problem introduced in [22] for computing policies for batch mode reinforcement learning in a deterministic setting. First, we show that this problem is NP-hard. In the two-stage case, we provide two relaxation schemes. The first relaxation scheme works by dropping some constraints in order to obtain a problem that is solvable in polynomial time. The second relaxation scheme, based on a Lagrangian relaxation where all constraints are dualized, leads to a conic quadratic programming problem. We also theoretically prove and empirically illustrate that both relaxation schemes provide better results than those given in [22].
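As a generic illustration of why both kinds of relaxation mentioned in the abstract produce bounds on the original problem, consider an arbitrary constrained minimization. The symbols f, g_i, X, S and \lambda below are placeholders chosen for this sketch and are not the paper's notation; the paper's actual formulation and its specific relaxations are not reproduced here.

```latex
% Generic sketch only: f, g_i, X, S and \lambda are placeholder symbols,
% not the notation or the exact problem of the paper.
\begin{align*}
p^\star &= \min_{x \in X} \bigl\{ f(x) : g_i(x) \le 0,\ i = 1,\dots,m \bigr\} \\
% Dropping constraints enlarges the feasible set, so the minimum can only decrease.
p_S &= \min_{x \in X} \bigl\{ f(x) : g_i(x) \le 0,\ i \in S \bigr\},
  \quad S \subseteq \{1,\dots,m\}
  &&\Rightarrow\ p_S \le p^\star \\
% Dualizing all constraints with multipliers \lambda \ge 0 (weak duality).
q(\lambda) &= \min_{x \in X} \Bigl[ f(x) + \sum_{i=1}^{m} \lambda_i\, g_i(x) \Bigr],
  \quad \lambda \ge 0
  &&\Rightarrow\ q(\lambda) \le p^\star
\end{align*}
```

In both cases the relaxed value is a lower bound on the original minimum. How tight each bound is depends on which constraints are dropped or on the choice of multipliers.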
Boigelot Bernard
Ernst Damien
Fonteneau Raphael
Louveaux Quentin
Min Max Generalization for Deterministic Batch Mode Reinforcement Learning: Relaxation Schemes