/PT-AD

Pseudo-transient auto-diff playground

Primary LanguageJuliaMIT LicenseMIT

PT-AD

The aim of this repository is to combine solvers using the Pseudo-Transient Method with Automatic Differentiation (AD) tools, here Enzyme.jl, on GPUs in order to retrieve necessary objects to perform inversions or parameter optimisation.

💡 See our JuliaCon22 talk for details.

As proof of concept, we successfully used AD to:

  1. retrieve the residual of the Adjoint variable we can then use in the Pseudo-Transient solving procedure in order to solve for the Adjoint problem, and to;
  2. retrieve the gradient of the cost-function with respect to the parameter we are inverting/optimising for.

⚠️ Note: currently, the master branch of Enzyme.jl is needed to successfully execute the presented scripts. Consider ] add Enzyme#master within Julia's pkg manager.

Benchmarks

The benchmark codes are available in the scripts_ok folder and execute on Nvidia GPUs if cuda is appended in their name, and on the CPU otherwise. The adjoint_nonlin_diff_1D_v2.jl script provides a direct comparison between the CPU and GPU implementation.

1D power-law diffusion

We tested our workflow in a 1D configuration in which we try to invert for the power-law exponent of a nonlinear diffusion equation as described in Reuber et al. 2020 - Appendix and Reuber 2021. We used a gradient descent approach to minimise the misfit between observed (synthetic) and calculated quantity H to retrieve the optimal power-law exponent n:

Inverting for power-law exponent in 1D

code: adjoint_nonlin_diff_1D_v2.jl, adjoint_nonlin_diff_1D_v2_cuda.jl

Shallow-Ice

We tested our approach to invert for the equilibrium line (ELA) or the "mass-balance gradient" in an ice-flow model as described by Visnjevic et al. 2018, both in 1D and 2D. For testing purpose, we use a slightly more synthetic configuration:

Inverting forMB gradient in 2D

code: adjoint_nonlin_diff_react_2D_cuda.jl

1D inversion for ELA

The ELA inversion code is available only in 1D and was used to prototype the 2D SIA inversion.

code: adjoint_nonlin_diff_react_1D_cuda_ELA.jl

Note on performance

Effective memory throughput

Performance-wise, we compared the efficiency (measuring the effective memory throughput $T_\mathrm{eff}$ in GB/s) of the forward and adjoint solver. In addition, we also report the time per iteration as function of numerical grid resolution for both solvers.

Effective memory throughput

Time per iteration

In addition, we also "naively" compared the execution time per forward solve iteration in seconds when executing both on the GPU and on the CPU. There we see about 2 orders of magnitude speed-up.

CPU vs GPU timing

References

Giles, M. B., & Pierce, N. A. (2000). An Introduction to the Adjoint Approach to Design. Flow, Turbulence and Combustion, 65(3/4), 393–415. https://doi.org/10.1023/A:1011430410075

Reuber, G. S. (2021). Statistical and deterministic inverse methods in the geosciences: Introduction, review, and application to the nonlinear diffusion equation. GEM - International Journal on Geomathematics, 12(1), 19. https://doi.org/10.1007/s13137-021-00186-y