Trading Regret for Efficiency: Online Convex Optimization with Long Term Constraints

Computer Science – Learning

Scientific paper

Rate now

[ 0.00 ] – not rated yet Voters 0 Comments 0

Details Trading Regret for Efficiency: Online Convex Optimization with Long Term Constraints Trading Regret for Efficiency: Online Convex Optimization with Long Term Constraints

: 2011-11-25
: arxiv.org/abs/1111.6082v1
: Computer Science
: Learning

: Scientific paper
: In this paper we propose a framework for solving constrained online convex optimization problem. Our motivation stems from the observation that most algorithms proposed for online convex optimization require a projection onto the convex set $\mathcal{K}$ from which the decisions are made. While for simple shapes (e.g. Euclidean ball) the projection is straightforward, for arbitrary complex sets this is the main computational challenge and may be inefficient in practice. In this paper, we consider an alternative online convex optimization problem. Instead of requiring decisions belong to $\mathcal{K}$ for all rounds, we only require that the constraints which define the set $\mathcal{K}$ be satisfied in the long run. We show that our framework can be utilized to solve a relaxed version of online learning with side constraints addressed in \cite{DBLP:conf/colt/MannorT06} and \cite{DBLP:conf/aaai/KvetonYTM08}. By turning the problem into an online convex-concave optimization problem, we propose an efficient algorithm which achieves $\tilde{\mathcal{O}}(\sqrt{T})$ regret bound and $\tilde{\mathcal{O}}(T^{3/4})$ bound for the violation of constraints. Then we modify the algorithm in order to guarantee that the constraints are satisfied in the long run. This gain is achieved at the price of getting $\tilde{\mathcal{O}}(T^{3/4})$ regret bound. Our second algorithm is based on the Mirror Prox method \citep{nemirovski-2005-prox} to solve variational inequalities which achieves $\tilde{\mathcal{\mathcal{O}}}(T^{2/3})$ bound for both regret and the violation of constraints when the domain $\K$ can be described by a finite number of linear constraints. Finally, we extend the result to the setting where we only have partial access to the convex set $\mathcal{K}$ and propose a multipoint bandit feedback algorithm with the same bounds in expectation as our first algorithm.

Affiliated with

Jin Rong

Computer Science – Learning

Scientist

[ 0.00 ] – not rated yet Voters 0 Comments 0

Mahdavi Mehrdad

Computer Science – Learning

Scientist

[ 0.00 ] – not rated yet Voters 0 Comments 0

Yang Tianbao

Computer Science – Learning

Scientist

[ 0.00 ] – not rated yet Voters 0 Comments 0

Also associated with

No associations

LandOfFree

Say what you really think

Search LandOfFree.com for scientists and scientific papers. Rate them and share your experience with other people.

Rating

Trading Regret for Efficiency: Online Convex Optimization with Long Term Constraints does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.
If you have personal experience with Trading Regret for Efficiency: Online Convex Optimization with Long Term Constraints, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Trading Regret for Efficiency: Online Convex Optimization with Long Term Constraints will most certainly appreciate the feedback.

Rate now

Comments { 0 }

Profile ID: LFWR-SCP-O-174984

All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.

Canada

Charities
Companies
MP Candidates
Patents
Employee Salary Disclosure

World

Places of the World
Scientific Papers

United States

Banks
Companies
Counties
Patents
Employee Salary Disclosure