Blended Conditional Gradients
We present a blended conditional gradient algorithm for minimizing a smooth convex function over a polytope P that combines projection-free gradient descent steps with conditional gradient steps, achieving linear convergence for strongly convex functions. It makes use of neither away steps nor pairwise steps, but retains all favorable properties of conditional gradient algorithms, most notably not requiring projections onto P and maintaining iterates as sparse convex combinations of extreme points of P. The algorithm decreases measures of optimality (primal and dual gaps) rapidly, both in the number of iterations and in wall-clock time, outperforming even the efficient lazified conditional gradient algorithms of Braun et al. [2017]. We also present a streamlined algorithm, called simplex gradient descent, for the special case in which P is the probability simplex.
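To make the blending idea concrete, the following is a minimal, illustrative sketch for the special case of the probability simplex. It is not the paper's exact procedure: the function name `blended_cg`, the fixed step-size heuristics, and the simple local-versus-global gap test used to choose between a descent step over the active vertices and a conditional gradient step are all assumptions made for this example (the paper instead uses a lazified oracle and carefully chosen step sizes).

```python
# Hedged sketch of a blended conditional gradient loop over the probability
# simplex. Step rules and the branching test are simplified assumptions, not
# the authors' exact method.
import numpy as np

def blended_cg(grad, x0, step=0.1, iters=500, tol=1e-8):
    """Minimize a smooth convex f (given via its gradient) over the simplex."""
    x = x0.astype(float).copy()
    for _ in range(iters):
        g = grad(x)
        v = int(np.argmin(g))             # Frank-Wolfe vertex from the LP oracle
        fw_gap = float(g @ x - g[v])      # global (Frank-Wolfe) dual gap
        if fw_gap < tol:
            break
        idx = np.where(x > 1e-12)[0]      # active vertices: support of x
        i_away = idx[np.argmax(g[idx])]   # worst active vertex
        i_to = idx[np.argmin(g[idx])]     # best active vertex
        local_gap = float(g[i_away] - g[i_to])
        if local_gap >= fw_gap:
            # Descent step inside the convex hull of the active vertices:
            # shift mass from the worst to the best active coordinate.
            # No projection onto the simplex is needed; clipping at x[i_away]
            # keeps the iterate feasible and may drop a vertex (sparsity).
            t = min(step * local_gap, x[i_away])
            x[i_away] -= t
            x[i_to] += t
        else:
            # Conditional gradient step toward the new vertex e_v.
            gamma = min(step * fw_gap, 1.0)
            x *= 1.0 - gamma
            x[v] += gamma
    return x

# Usage: minimize ||x - b||^2 over the simplex (gradient is 2(x - b)).
b = np.array([0.1, 0.7, 0.2, -0.3])
x_star = blended_cg(lambda x: 2.0 * (x - b), np.full(4, 0.25))
```

The sketch mirrors the trade-off described above: descent steps over the current active set make cheap progress without calling the LP oracle, while conditional gradient steps add a new extreme point only when the global gap indicates it is worthwhile, keeping iterates sparse.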