adjtomo/seisflows

methodology for line search inversion using the gradient

Opened this issue · 2 comments

Hello, I am trying to run the 2D example walkthrough. I am interested in learning the math/methodology behind the line search / bracket search method used in this example for the inversion step.

It seems that in the "scaling gradient to absolute model perturbations" step, the misfit kernel (the sum of the kernels for the individual sources) in the scratch/eval_grad/misfit_kernel/ directory is simply multiplied by the model (Vs) to produce the gradient in the scratch/eval_grad/gradient/ directory.
The misfit is the time integral of the square of the difference between observed and predicted waveforms.
The initial candidate value of the step length alpha (1.51E+12 in the example), which is multiplied by the gradient to produce the model perturbations, appears to be 1/sum(G.^2), where G is the gradient (a 2D matrix).
I am trying to understand how the next two considered values of alpha (1.21E+10, 1.95E+09) are produced.

2024-02-16 02:20:10 (I) | try: first evaluation, attempt guess step length, alpha=1.51E+12
2024-02-16 02:20:10 (I) | try: applying initial step length safeguard as alpha has exceeded maximum step length, alpha_new=1.21E+10
2024-02-16 02:20:10 (D) | overwriting initial step length, alpha_new=1.95E+09

After this, it seems the alpha value is just multiplied by the golden ratio (1.618) repeatedly to find a local minimum of the misfit.
If you could point me to the subroutine that performs these calculations, or to documentation for these steps, that would be very helpful.
Thanks.

Hi @avinash07guddu, thanks for the question. Optimization and line search are handled by the optimize module and the underlying line_search tools. The hierarchy goes: Workflow.Inversion -> Optimize.Gradient -> LineSearch.Bracket.

For your specific question, alpha is calculated by the line search:

```python
def calculate_step_length(self):
    """
    Determines step length (alpha) and search status (status) using a
    bracketing line search. Evaluates Wolfe conditions to determine if
    a step length is acceptable.

    .. note::
        Available status returns are:
        'TRY': try/re-try the line search as conditions have not been met
        'PASS': line search was successful, you can terminate the search
        'FAIL': line search has failed for one or more reasons.

    :rtype: tuple (float, str)
    :return: (alpha==calculated step length,
        status==how to treat the next step count evaluation)
    """
    # Determine the line search history
    x, f, gtg, gtp, step_count, update_count = self.get_search_history()
    self._print_stats(x, f)

    # For the first inversion and initial step, set alpha manually
    if step_count == 0 and update_count == 0:
        # Based on idea from Dennis and Schnabel
        alpha = gtg[-1] ** -1
        logger.info(f"try: first evaluation, attempt guess step length, "
                    f"alpha={alpha:.2E}")
        status = "TRY"
    # For every iteration's initial step, set alpha manually
    elif step_count == 0:
        # Based on the first equation in sec 3.5 of Nocedal and Wright 2ed
        idx = np.argmin(self.func_vals[:-1])
        alpha = self.step_lens[idx] * gtp[-2] / gtp[-1]
        logger.info(f"try: first step count of iteration, "
                    f"setting scaled step length, alpha={alpha:.2E}")
        status = "TRY"
    # If misfit is reduced and then increased, we've bracketed. Pass
    elif _check_bracket(x, f) and _good_enough(x, f):
        alpha = x[f.argmin()]
        logger.info(f"pass: bracket acceptable and step length "
                    f"reasonable. returning minimum line search misfit.")
        status = "PASS"
    # If misfit is reduced but not close, set to quadratic fit
    elif _check_bracket(x, f):
        alpha = polynomial_fit(x, f)
        logger.info(f"try: bracket acceptable but step length unreasonable, "
                    f"attempting to re-adjust step length "
                    f"alpha={alpha:.2E}")
        status = "TRY"
    # If misfit continues to step down, increase step length
    elif step_count < self.step_count_max and all(f <= f[0]):
        alpha = 1.618034 * x[-1]  # 1.618034 is the 'golden ratio'
        logger.info(f"try: misfit not bracketed, increasing step length "
                    f"using golden ratio, alpha={alpha:.2E}")
        status = "TRY"
    # If misfit increases, reduce step length by backtracking
    elif step_count < self.step_count_max:
        slope = gtp[-1] / gtg[-1]
        alpha = parabolic_backtrack(f0=f[0], g0=slope, x1=x[1],
                                    f1=f[1], b1=0.1, b2=0.5)
        logger.info(f"try: misfit increasing, attempting "
                    f"to reduce step length using parabolic backtrack, "
                    f"alpha={alpha:.2E}")
        status = "TRY"
    # step_count_max exceeded, fail
    else:
        logger.info(f"fail: bracketing line search has failed "
                    f"to reduce the misfit before exceeding "
                    f"`step_count_max`={self.step_count_max}")
        alpha = None
        status = "FAIL"

    # Apply optional step length safeguard to step length
    if status == "TRY":
        if alpha > self.step_len_max and step_count == 0:
            alpha = 0.618034 * self.step_len_max
            logger.info(f"try: applying initial step length "
                        f"safeguard as alpha has exceeded maximum step "
                        f"length, alpha_new={alpha:.2E}")
        # Stop because safeguard prevents us from going further
        elif alpha > self.step_len_max:
            alpha = self.step_len_max
            logger.info(f"try: applying step length safeguard as alpha has "
                        f"exceeded maximum allowable step length, "
                        f"alpha_new={alpha:.2E}")

    return alpha, status
```
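To connect this to the specific numbers in your log, here is a minimal plain-Python sketch of the first-evaluation guess, the initial-step safeguard, and the golden-ratio expansion. The values of `gtg_last` and `step_len_max` below are made up, chosen only to roughly mimic the example; the real values come from your gradient and model. The third value in your log (the "overwriting initial step length" debug message, alpha_new=1.95E+09) comes from a separate initial-step-length override applied in the optimize module, so it is not reproduced here.

```python
GOLDEN = 1.618034  # golden ratio used by the bracketing search

def first_guess(gtg_last):
    # First-ever evaluation: alpha = 1 / (g . g), the Dennis & Schnabel guess
    return 1.0 / gtg_last

def apply_safeguard(alpha, step_len_max, step_count):
    # Cap alpha at the maximum allowable step length; on the very first
    # step, back off to 0.618034 * max instead of clipping exactly to it
    if alpha > step_len_max and step_count == 0:
        return 0.618034 * step_len_max
    return min(alpha, step_len_max)

# Made-up inputs that roughly reproduce the example log
gtg_last = 6.6e-13      # assumed inner product g . g of the gradient
step_len_max = 1.96e10  # assumed absolute maximum step length

alpha = first_guess(gtg_last)                               # ~1.51E+12, exceeds the cap
alpha = apply_safeguard(alpha, step_len_max, step_count=0)  # ~1.21E+10

# While the misfit keeps decreasing and no bracket has formed, each
# subsequent trial step grows geometrically by the golden ratio:
alpha_next = GOLDEN * alpha
```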

The maximum step length itself, however, is set in the optimization module, where the user-specified (relative) step_len_max is scaled into an absolute length before the line search runs:

```python
if self.step_len_max:
    new_step_len_max = self.step_len_max * norm_m / norm_p
    self._line_search.step_len_max = new_step_len_max
    logger.info("enforcing max step length safeguard")
```
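To make that scaling concrete, here is an illustrative sketch with my own numbers, assuming (as a simplification) max-norms for the model `m` and search direction `p`. The user-facing `step_len_max` is a dimensionless fraction of the model norm; dividing by `norm_p` converts it into an absolute cap on alpha, so the largest pointwise model update `alpha * max|p|` never exceeds that fraction of `max|m|`:

```python
import numpy as np

step_len_max = 0.5  # user parameter: allowed fraction of the model norm
m = np.array([3400.0, 3500.0, 3600.0])  # model values (e.g. Vs), illustrative
p = np.array([-2e-9, 1.5e-9, -1e-9])    # search direction (e.g. -gradient)

norm_m = np.abs(m).max()
norm_p = np.abs(p).max()

# Convert the relative fraction into an absolute step length cap
new_step_len_max = step_len_max * norm_m / norm_p

# Any alpha <= new_step_len_max keeps the largest model perturbation
# below step_len_max * max|m| (equality when alpha hits the cap)
largest_update = new_step_len_max * norm_p
assert largest_update <= step_len_max * norm_m + 1e-6
```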

Hopefully that answers your question, or provides you with the correct files to look at!

Hi @avinash07guddu, just wondering if the above reply answered your question and we can close this issue?