milzj/FW4PDE

Implement optimal step size

Closed this issue · 3 comments

milzj commented

If we minimize $f(\cdot) + g(\cdot)$ and $f$ is quadratic, we have with $d = v-u$

$$ f(u+t(v-u)) = f(u+tv) = f(u) + t f'(u)d + t^2/2 f''(u)d^2. $$

Can we minimize

$$ f(u+tv) = f(u) + t f'(u)d + t^2/2 f''(u)d^2 + g(u+td) $$

over $t \in [0,1]$? Subgradient inequality yields lower bound on $g(u+td)$: $g(u+td) - g(u) \geq t(h, d)$. We could also minimize

$$ f(u+tv) = f(u) + t f'(u)d + t^2/2 f''(u)d^2 $$

over $t \in [0,1]$ and use the solution as step size.

milzj commented

If $g$ is Lipschitz continuous with Lipschitz constant $L$, we could also use $g(u+td) \leq g(u) + L t |d|$

milzj commented

Use $g(u+t(v-u)) = g(tv + (1-t)u) \leq tg(v) + (1-t) g(u)$

milzj commented

Define $\phi = f + g$ and let $v_k$ be the solution to the subproblem. Define $\ell_k(x) = f(x_k) + (\nabla f'(x_k), x-x_k) + g(x)$. We have $\ell_k(v_k) = f(x_k) - \Psi(x_k) + g(x_k)$, where $\Psi$ is the gap functional. Suppose that $f$ is a smooth, convex, quadratic function with Hessian $H$ and define $d_k = v_k-x_k$. Following the computations in A. Ben-Tal, A. Nemirovski, Lectures on Modern Convex Optimization 2021/2022, we have

$$ \phi(x_k+\gamma_kd_k) = \ell_k(x_k+\gamma_kd_k) + [f(x_k+\gamma_kd_k) - f(x_k) - (\nabla f(x_k), \gamma_kd_k)] \leq (1-\gamma_k)\ell_k(x_k) + \gamma_k \ell_k(v_k) + (1/2) \gamma_k^2 H_k d_k^2 = (1-\gamma_k)\phi(x_k) + \gamma_k[f(x_k) - \Psi(x_k) +g(x_k)] + (1/2) \gamma_k^2 H_k d_k^2 $$

Morever $(1-\gamma_k) \phi(x_k) + \gamma_k f(x_k) + \gamma_k g(x_k) = f(x_k) + g(x_k)$.

Hence

$\phi(x_k+\gamma_kd_k) \leq \phi(x_k) + \gamma_k (-\Psi(x_k)) + (1/2) \gamma_k^2 H_k d_k^2$.

The rhs can be minimized over $\gamma_k \in [0,1]$.