Accompanying code and results for the paper: "1D Ice Shelf Hardness Inversion: Clustering Behavior and Collocation Resampling in Physics-Informed Neural Networks" by Yunona Iwasaki and Ching-Yao Lai. We document the codes used to train PINNs for 1D ice-shelf inverse modeling. Additionally, we provide scripts to facilitate analysis of training results over repeated trials.
Please direct questions about this code and documentation to Ching-Yao Lai (cyaolai@stanford.edu) and Yunona Iwasaki (yiwasaki@berkeley.edu).
In our study, we investigated the feasibility of using a physics-informed neural network (PINN) to invert for the correct spatial profile of the ice-shelf hardness $B(x)$.
Thus, we introduced a weighting hyperparameter $\gamma$ that sets the relative weight of the data loss and the equation loss in the training objective.
Indeed, we observed that the attainable predictive accuracy of PINNs improves significantly for higher values of $\gamma$.
Figure 1. The clustering behaviour of PINN predictions.
Given this problematic behavior, we conclude that setting $\gamma$ to a high value does not by itself guarantee an accurate inversion.
Figure 2. Distribution of PINN prediction errors over repeated training trials.
Our scripts should be run in a conda environment using one of the `.yml` files in the `env` folder. To install conda, please follow the instructions here.
Then, create a conda environment by running the command:
```
conda env create -f environment.yml
```
Replace `environment.yml` with the environment file appropriate for your platform:
- `environment-cluster.yml`: for running with TensorFlow 2.4 on an HPC cluster (CPU only)
- `environment-osx.yml`: for running on macOS with TensorFlow 2.4
- `environment-windows.yml`: for running on Windows with TensorFlow 2.5 (TensorFlow 2.4 is unsupported on Windows)
Main script for training PINNs to predict the correct 1D ice-shelf hardness profile $B(x)$.
This script requires the user to specify the ground truth `.mat` data file (line 35) and the ground truth hardness via the `B_truth` variable (line 39). `B_truth` should contain the ground truth values of $B(x)$.
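For reference, the contents of a ground truth `.mat` file can be inspected with `scipy.io` before training (the path below is a placeholder, not a file shipped with the repository):

```python
from scipy.io import loadmat

data = loadmat('path_to_file/ground_truth.mat')  # placeholder path
print(data.keys())  # list the variables stored in the file
```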
Additionally, this script requires users to specify the following hyperparameters relevant to our optimization study:
- `N_t` (int): Number of collocation points. This number stays fixed, even if the script switches between collocation resampling and fixed collocation points (line 43).
- `layers` (list): List specifying the width and depth of the neural network. Specify the size of each layer except for the input layer, e.g. `layers = [5,5,3]` for a neural network with two 5-unit hidden layers. The final value specifies the size of the output layer and must be set to 3 for this problem. (line 47)
- `num_iterations_adam_resampled`, `num_iterations_adam_fixed`, `num_iterations_lbfgs` (int): Specify the number of iterations to train with each optimizer and collocation method (lines 53-55):
  - `adam_resampled`: train with the Adam optimizer using collocation resampling.
  - `adam_fixed`: train with the Adam optimizer with fixed collocation points.
  - `lbfgs`: train with the L-BFGS optimizer with fixed collocation points.
- `test_noise` (float): level of noise added to the ground truth $u(x)$ and $h(x)$ profiles during synthetic data generation. Please refer to p. 6 of the main text for the definition of the noise level; it may also be helpful to see its implementation in the script `noise.py`.

  Note: We have not included the option to run L-BFGS with collocation resampling, as L-BFGS is a second-order optimization algorithm (i.e. the update to the neural network weights is determined by the two preceding iterations); training will quickly terminate if this is attempted.

- `test_gammas` (list): specify one or multiple values of $\gamma$ to test. To conveniently implement logarithmic spacing of $\frac{\gamma}{1-\gamma}$, the user may specify the $\log_{10}\left(\frac{\gamma}{1-\gamma}\right)$ values using the `logratios` variable, then solve for the corresponding $\gamma$-values in the next line (lines 221-222); a sketch of this spacing follows this list.
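As a minimal sketch of this logarithmic spacing (the endpoints and number of points below are illustrative assumptions, not the values used in the paper):

```python
import numpy as np

# Evenly spaced values of log10(gamma / (1 - gamma)); endpoints are illustrative.
logratios = np.linspace(-2.0, 2.0, 9)

# Invert the ratio: if r = gamma / (1 - gamma), then gamma = r / (1 + r).
ratios = 10.0**logratios
test_gammas = ratios / (1.0 + ratios)
```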
Additional information can be found in the line-by-line explanations provided in the code comments.
Contains the loss functions used in the paper, namely `SquareLoss`, used for fixed collocation points, and `SquareLossRandom`, used for collocation resampling. Please refer to the final section of this README ("Code Implementation of Collocation Resampling") for a detailed explanation of how these two functions differ.
Both loss functions evaluate the predictive accuracy of the neural network after each iteration according to the characteristic objective function of PINNs, which we call $J(\Theta)$:

$$J(\Theta) = \gamma\, L_{\mathrm{data}}(\Theta) + (1-\gamma)\, L_{\mathrm{eqn}}(\Theta),$$

where we introduce an additional weighting hyperparameter $\gamma$. Please see `formulations/eqns_o1_inverse.py` for the implementation of these equations in our codes.
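To make the weighting concrete, here is a schematic of how such a weighted objective can be assembled (function and variable names are hypothetical; the repository's loss classes implement the actual computation):

```python
import tensorflow as tf

def weighted_objective(data_residuals, eqn_residuals, gamma):
    # J(Theta) = gamma * (data misfit) + (1 - gamma) * (equation residual),
    # with each term a sum of mean-squared residuals.
    loss_data = tf.add_n([tf.reduce_mean(tf.square(r)) for r in data_residuals])
    loss_eqn = tf.add_n([tf.reduce_mean(tf.square(r)) for r in eqn_residuals])
    return gamma * loss_data + (1.0 - gamma) * loss_eqn
```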
An instance of the `SquareLoss` function is initialized by the following code:

```python
loss = SquareLoss(equations=physics_equations, equations_data=data_equations, gamma=gamma)
```
where
- `equations`: An iterable of callables with the signature `function(x, neuralnet)` corresponding to the governing physics equations (a minimal sketch of such a callable appears after this list). To enforce the 1D SSA, we pass `Inverse_1stOrder_Equations`, imported from `formulations/eqns_o1_inverse.py`.
- `equations_data`: An iterable of callables with the signature `function(x, neuralnet)` corresponding to the data constraints. We use `Data_Equations`, imported from `formulations/eqns_o1_inverse.py`.
- `gamma` (float): the value of $\gamma$ with which to evaluate the objective function $J(\Theta)$.
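For illustration only, a minimal sketch of a callable with this signature might look as follows (entirely hypothetical; the actual residuals are implemented in `formulations/eqns_o1_inverse.py`):

```python
import tensorflow as tf

def example_equation(x, neuralnet):
    # Evaluate the network at the collocation points x and return a residual
    # tensor; the loss function squares and averages whatever is returned.
    with tf.GradientTape() as tape:
        tape.watch(x)
        u, h, B = tf.unstack(neuralnet(x), axis=-1)  # three outputs (cf. `layers`)
    u_x = tape.gradient(u, x)  # spatial derivative via automatic differentiation
    return u_x  # placeholder residual for illustration only

physics_equations = [example_equation]  # `equations` expects an iterable
```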
`SquareLossRandom` is initialized with the same arguments.
- `constants.py`: defines the values of the physical constants appearing in the physics-enforcing equations.
- `eqns_o1_inverse.py`: implements the PINN equations for the ice-shelf hardness inversion problem (see Equations (17)-(20), p. 5 of the main text).
- `helpers.py`: some additional helper functions.
Includes helper functions relevant for synthetic training data generation.
Implements Adam and L-BFGS optimizers.
Helper functions for neural network initialization.
Ground truth profiles for $u(x)$, $h(x)$, and $B(x)$.
Jupyter notebook for consolidating error data from a set of trial result dictionaries into a single numpy array.
Jupyter notebook that loads the numpy error array of a set of training trials and separates trials by $\gamma$-value.
Jupyter notebook for visualizing the distribution of errors across trials.
Numpy arrays of the $u(x)$, $h(x)$, and $B(x)$ prediction errors for each set of trials. Error values can be accessed as follows:

```python
import numpy as np

errors = np.load('path_to_file/errors.npy')
gi = 1
u_errs = errors[gi][0]
h_errs = errors[gi][1]
B_errs = errors[gi][2]
```

where `gi` should be modified to the index of the $\gamma$-value of interest.
Naming conventions for each numpy array are as follows:
Error results using the standard settings (six 20-unit hidden layers with $N_t = 1001$ collocation points).
Results from training with collocation resampling followed by fixed collocation points, for each tested noise level; the indices `gi = 0` through `gi = 11` correspond to the twelve tested $\gamma$-values (the specific values are set by `test_gammas`).
Results from tests with increased neural network width; `ux` corresponds to *x* units per hidden layer. (Noise level = 0.3 for both.)
Results from the final test with two 100-unit hidden layers and 1001 collocation points. (Noise level = 0.3.)
Training can be switched between fixed collocation points and collocation resampling by switching the loss function used during training. The loss function evaluated by a given optimizer is specified during the initialization of the optimizer. Use the `SquareLoss` loss function when using fixed collocation points, and `SquareLossRandom` for random collocation resampling (see lines 77-100 in `pinn_trials.py`).
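Schematically, switching modes amounts to constructing a different loss object before initializing the optimizer (a sketch reusing the initialization signature shown earlier; the actual calls appear in lines 77-100 of `pinn_trials.py`):

```python
# Fixed collocation points: the same points are evaluated at every iteration.
loss = SquareLoss(equations=physics_equations, equations_data=data_equations, gamma=gamma)

# Collocation resampling: a new set of points is drawn at every iteration.
loss = SquareLossRandom(equations=physics_equations, equations_data=data_equations, gamma=gamma)
```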
Comparing the `SquareLoss` and `SquareLossRandom` functions in `loss.py`, the main difference between the two lies in the `__call__` method. For `SquareLossRandom`, we add a few extra lines at the beginning of the `__call__` method (lines 54-61):
```python
def __call__(self, x_eqn, data_pts, net) -> Dict[str, tf.Tensor]:
    xmin = 0.0
    xmax = 1.0
    N_t = 1001
    _data_type = tf.float64
    # Draw a fresh batch of N_t collocation points uniformly in [xmin, xmax].
    collocation_pts = xmin + (xmax - xmin) * self.col_gen.uniform(shape=[N_t])
    # Cubing the uniform samples concentrates collocation points near x = 0.
    collocation_pts = collocation_pts**3
```
where `self.col_gen` is a stateful random generator defined in the `__init__` method (line 52):

```python
self.col_gen = tf.random.get_global_generator()
```
Thus, the `SquareLossRandom` function generates a new set of collocation points every time it is called, i.e. at every iteration.
**Important Note:** It is essential to use a stateful random number generator such as `tf.random.Generator` to ensure that the collocation points are resampled after each iteration. Using a stateless random generator (such as those provided in the `numpy.random` module, or the `lhs` generator used in our codes for fixed collocation point generation) will not allow the collocation points to be updated in a TensorFlow training loop, causing the loss function to behave identically to training with fixed collocation points.
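The following self-contained sketch (not from the repository) illustrates the pitfall: a stateful `tf.random.Generator` keeps drawing new points inside a compiled training step, while NumPy sampling executes only once, at tracing time, and is frozen into the graph:

```python
import numpy as np
import tensorflow as tf

gen = tf.random.get_global_generator()

@tf.function
def resample_stateful():
    # The stateful generator advances its internal state on every call,
    # so each invocation yields a fresh set of collocation points.
    return gen.uniform(shape=[5], dtype=tf.float64)

@tf.function
def resample_stateless():
    # NumPy runs only while the function is being traced; the sampled values
    # are baked into the graph as constants, so every subsequent call returns
    # the same points, i.e. fixed collocation points.
    return tf.constant(np.random.uniform(size=5))

print(resample_stateful().numpy(), resample_stateful().numpy())    # different
print(resample_stateless().numpy(), resample_stateless().numpy())  # identical
```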
Yunona Iwasaki and Ching-Yao Lai. One-dimensional ice shelf hardness inversion: Clustering behavior and collocation resampling in physics-informed neural networks. Journal of Computational Physics, Volume 492, 2023, 112435, ISSN 0021-9991, https://doi.org/10.1016/j.jcp.2023.112435.
BibTeX:
```bibtex
@article{IWASAKI2023112435,
  title = {One-dimensional ice shelf hardness inversion: Clustering behavior and collocation resampling in physics-informed neural networks},
  journal = {Journal of Computational Physics},
  volume = {492},
  pages = {112435},
  year = {2023},
  issn = {0021-9991},
  doi = {https://doi.org/10.1016/j.jcp.2023.112435},
  url = {https://www.sciencedirect.com/science/article/pii/S0021999123005302},
  author = {Yunona Iwasaki and Ching-Yao Lai},
  keywords = {Physics-informed neural networks, Scientific machine learning, Ice dynamics, Geophysical fluid dynamics, Nonlinear dynamics, Inverse problems},
}
```