Inversionson is a workflow manager which automatically performs a full-waveform inversion (FWI) of seismic data. It has built-in support for various types of FWI workflows:
- A standard workflow, as used for example in Krischer et al., 2018
- The dynamic mini-batch workflow described by van Herwaarden et al., 2020
- The wavefield-adapted mesh workflow described by Thrastarson et al., 2020
- A combination of dynamic mini-batches and wavefield-adapted meshes
It also has built-in support for a validation dataset: a dataset that is not explicitly part of the inversion but is reserved for monitoring the inversion procedure with independent data. The validation dataset can be used, for example, to tune regularization. There is also support for reserving a test dataset, for which the misfit is computed at the end of the inversion process.
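As a rough illustration of how the three datasets relate, the sketch below removes the reserved validation and test events from the full event list, leaving the events that take part in the inversion. The function split_events and the event names are hypothetical and not part of Inversionson; in practice the reserved events are listed in the configuration file.

```python
def split_events(all_events, validation_events, test_events):
    """Return the events left for the inversion after reserving the others."""
    validation = set(validation_events)
    test = set(test_events)

    # An event reserved for both purposes is most likely a configuration mistake.
    overlap = validation & test
    if overlap:
        raise ValueError(f"Events in both validation and test sets: {sorted(overlap)}")

    # Reserved events must exist in the project.
    unknown = (validation | test) - set(all_events)
    if unknown:
        raise ValueError(f"Reserved events not in the project: {sorted(unknown)}")

    reserved = validation | test
    # Keep the original ordering for the events used in the inversion.
    return [event for event in all_events if event not in reserved]


# Hypothetical event names, for illustration only.
events = ["event_A", "event_B", "event_C", "event_D", "event_E"]
inversion_events = split_events(events, ["event_B"], ["event_E"])
print(inversion_events)
```

Raising an error on overlapping or unknown event names is a design choice of this sketch, intended to catch typos before a long inversion is started.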
The inversion will be automatically documented in an easily readable Markdown file. By setting some environment variables regarding Twilio, Inversionson can send you a WhatsApp message after each iteration.
Inversionson is not a standalone package but rather a wrapper around a few key FWI software packages. It is thus important to have these packages installed and available if you want to use Inversionson. The main non-standard packages that Inversionson requires are LASIF, MultiMesh and Salvus. The recommended way of installing Inversionson is to follow the installation instructions of LASIF, clone the MultiMesh repository and install it, do the same with Inversionson, and finally install Salvus into the same environment.
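Before starting a project it can help to confirm that all of these packages are visible from the same Python environment. The following is a minimal sketch using only the standard library. The import names in the list (lasif, multi_mesh, salvus, inversionson) are assumptions and may differ from the names the packages actually install under, so adjust them to your installation.

```python
import importlib.util


def find_missing_packages(module_names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Assumed top-level import names; check each package's documentation for the real ones.
required = ["lasif", "multi_mesh", "salvus", "inversionson"]
missing = find_missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

The check only looks the modules up without importing them, so it is quick and has no side effects.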
Using Inversionson may seem a bit complicated at first, but once you get it working, it tends to run pretty smoothly. There are plans to make initializing an Inversionson project a much smoother process, but that has not been done yet. The following describes how to start an automatic FWI using Inversionson.
A process which should get your project going:
- Create the directory where you want to host your project.
- Use LASIF to initialize a LASIF project, or copy an existing project here.
- Finish setting up the LASIF project (define the domain, download data, define the frequency range of the synthetics).
- Create a directory called SALVUS_OPT inside the Inversionson project directory. This directory is where the L-BFGS optimization routine will be carried out.
- Move into this directory
- Inside the SALVUS_OPT directory, run this command:
<Path to your Salvus binary> invert -x <Path to this folder>
- Now create a file called run_salvus_opt.sh which contains only one line:
<Path to your Salvus binary> invert -i ./inversion.toml
- Salvus opt should now have created some files. One of them is inversion.toml, and you need to fill in some fields there, such as the initial model, the parameters to invert for, and whether you want to use batches of data or full gradients. It's hard to assist with this file, as it really depends on what you want to do, but feel free to contact me if you are having trouble.
- Once you have filled in the inversion.toml file, run:
sh run_salvus_opt.sh
- Now go back to the Inversionson folder and run:
python <Path to inversionson folder>/inversionson/create_dummy_info_file.py
- Fill in the relevant fields in the inversion_info.toml file. The file contains comments that explain the fields, but some of them are explained further here.
- inversion_mode: Can be either "mini-batch" (dynamic mini-batches) or "mono-batch" (full gradients)
- meshes: Can be either "multi-mesh" (wavefield-adapted meshes) or "mono-mesh" (same mesh for every simulation, defined by the LASIF domain file)
- model_interpolation_mode: Currently only supports "gll_2_gll", so don't worry about this as long as you use hdf5 meshes.
- inversion_parameters: Parameters to invert for. Make sure these are the same ones as in the inversion.toml file in the SALVUS_OPT directory.
- modelling_parameters: The parameters on the meshes you use for simulations.
- n_random_events: Only relevant for "mini-batch" mode. Describes how many of the events selected in each batch are random, vs how many are selected based on spatial coverage.
- min_ctrl_group_size: The minimum number of events used in the control group; again only relevant for "mini-batch" mode.
- max_angular_change: Used to decide how many events make it to the control group for the coming iteration in "mini-batch" mode.
- dropout_probability: A form of regularization. Events in the control group can be randomly dropped out with this probability so they don't get stuck there.
- initial_batch_size: In "mini-batch" mode, make sure it's the same as in inversion.toml.
- cut_source_region_from_gradient_in_km: Gradients become unphysical next to the source, and it can be good to cut that region out.
- cut_receiver_region_from_gradient_in_km: The same, but for receivers, where the unphysical effect is not nearly as severe. This is currently quite slow and I would recommend just putting 0.0 here.
- clip_gradient: You can clip the gradient at some percentile so that the highest/lowest values are removed. 1.0 doesn't clip at all.
- absorbing_boundaries: This is only a True/False flag; the actual absorbing boundaries are configured in lasif_config.toml.
- elements_per_azimuthal_quarter: Only relevant for "multi-mesh". Decides how many elements are used to sample the azimuthal dimension. See paper.
- smoothing_mode: Either "isotropic" or "anisotropic". The smoothing is always model-dependent, and can additionally be direction-dependent (anisotropic) or not (isotropic).
- smoothing_lengths: How many wavelengths to smooth. If anisotropic the three values are: radial, lat, lon. For isotropic, only input one value.
- iterations_between_validation_checks: When using a validation dataset, this decides how many iterations there are between validation checks. The models between checks are averaged. 0 means no check.
- validation_dataset: A list of events in your LASIF project that you want to reserve for validation checks and that will not be used in the inversion. Input event names.
- test_dataset: Same principle as with the validation_dataset.
- HPC.wave_propagation: Settings regarding wavefield simulations. Inversionson asks for double that walltime in adjoint runs, as they are more expensive.
- HPC.diffusion_equation: Settings regarding the smoothing computations.
- Once the file is configured, you should be able to start running Inversionson.
- I would recommend running Inversionson with tmux, as it keeps your shell running even if you lose the connection to your computer or accidentally close your terminal window.
- Run Inversionson with this command:
python -m inversionson.autoinverter
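Since a misconfigured inversion_info.toml is an easy way to lose time, a few of the constraints described in the field list above can be checked before launching. The sketch below is not part of Inversionson. It assumes the settings have already been loaded into a flat Python dictionary keyed by the field names above, which may not match the actual layout of the file, and the function name check_inversion_info is hypothetical.

```python
def check_inversion_info(info):
    """Collect human-readable problems found in an inversion_info-style dict."""
    problems = []

    if info.get("inversion_mode") not in ("mini-batch", "mono-batch"):
        problems.append('inversion_mode must be "mini-batch" or "mono-batch"')

    if info.get("meshes") not in ("multi-mesh", "mono-mesh"):
        problems.append('meshes must be "multi-mesh" or "mono-mesh"')

    # Isotropic smoothing takes one length, anisotropic takes three
    # (radial, lat, lon). A single number is accepted as a one-element list.
    lengths = info.get("smoothing_lengths", [])
    if not isinstance(lengths, (list, tuple)):
        lengths = [lengths]
    expected = {"isotropic": 1, "anisotropic": 3}.get(info.get("smoothing_mode"))
    if expected is None:
        problems.append('smoothing_mode must be "isotropic" or "anisotropic"')
    elif len(lengths) != expected:
        problems.append(f"smoothing_lengths needs {expected} value(s), got {len(lengths)}")

    # dropout_probability only matters for mini-batches and is a probability.
    if info.get("inversion_mode") == "mini-batch" and "dropout_probability" in info:
        if not 0.0 <= info["dropout_probability"] <= 1.0:
            problems.append("dropout_probability must be between 0 and 1")

    return problems


# Hypothetical values, for illustration only.
example = {
    "inversion_mode": "mini-batch",
    "meshes": "multi-mesh",
    "smoothing_mode": "anisotropic",
    "smoothing_lengths": [0.5, 1.0, 1.0],
    "dropout_probability": 0.15,
}
for problem in check_inversion_info(example):
    print("Problem:", problem)
```

Returning a list of problems rather than raising on the first one makes it possible to fix several settings in one pass.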
For any questions feel free to contact soelvi.thrastarson@erdw.ethz.ch