/Inversionson_fork

Full-Waveform Inversion workflow manager which automates the inversion process.

Primary language: Python. License: MIT.

Inversionson

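Inversionson is a workflow manager that automatically performs a full-waveform inversion (FWI) of seismic data. It has built-in support for various types of FWI workflows, including dynamic mini-batch or full-gradient inversions on either wavefield-adapted meshes or a single fixed mesh.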

It has built-in support for using a validation dataset, which is a dataset that is not explicitly part of the inversion but is reserved for monitoring the inversion procedure independently. The validation dataset can be used to tune regularization, for example. There is also support for reserving a test dataset, for which the misfit is computed at the end of the inversion process.

The inversion is automatically documented in an easily readable Markdown file. If you set some Twilio-related environment variables, Inversionson can send you a WhatsApp message after each iteration.

Central Libraries

Inversionson is not a standalone package but rather a wrapper around a few key FWI software packages. It is thus important to have these packages installed and available if you want to use Inversionson. The main non-standard packages that Inversionson requires are LASIF, MultiMesh and Salvus. The recommended way of installing Inversionson is to follow the installation instructions of LASIF, then clone and install the MultiMesh repository, do the same with Inversionson, and finally install Salvus into the same environment.
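
To confirm that all dependencies ended up in the same environment, a quick import check can help. The following is a minimal sketch, not part of Inversionson itself. The import names lasif, multi_mesh, salvus and inversionson are assumptions about how the packages are exposed and may differ in your installation.

```python
from importlib.util import find_spec

# Assumed top-level import names; adjust them if your installation differs.
REQUIRED_MODULES = ("lasif", "multi_mesh", "salvus", "inversionson")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not importable in this environment:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it with the Python interpreter of the environment you intend to use for the inversion, so that the check reflects what Inversionson will actually see.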

Usage

Using Inversionson may seem a bit complicated at first, but once you get it working, it tends to run pretty smoothly. There are plans to make initializing an Inversionson project much smoother, but that has not been done yet. The following describes how to start an automatic FWI using Inversionson.

A process which should get your project going:

  1. Create the directory where you want to host your project.
  2. Use LASIF to initialize a LASIF project, or copy an existing project here.
    • Finish setting up the LASIF project (define the domain, download data, define the frequency range of the synthetics).
  3. Create a directory called SALVUS_OPT inside the Inversionson project directory
    • This directory is where the L-BFGS optimization routine will be carried out.
    • Move into this directory
  4. Inside the SALVUS_OPT directory, run:
<Path to your Salvus binary> invert -x <Path to this folder>
  5. Create a file called run_salvus_opt.sh that contains only this line:
<Path to your Salvus binary> invert -i ./inversion.toml
  6. Salvus opt should now have created some files.

    • One of them is inversion.toml. Fill in its fields, such as the initial model, the parameters to invert for, and whether you want to use batches of data or full gradients.
    • The right settings depend on what you want to do, so it is hard to give general guidance for this file. Feel free to contact me if you run into trouble.
  7. Once you have filled in the inversion.toml file, run:

sh run_salvus_opt.sh
  8. Go back to the Inversionson project directory and run:
python <Path to inversionson folder>/inversionson/create_dummy_info_file.py
  9. Fill in the relevant fields in the inversion_info.toml file. The file contains comments that explain the fields, but some of them are explained further here.

    • inversion_mode: Can be either "mini-batch" (dynamic mini-batches) or "mono-batch" (full gradients)
    • meshes: Can be either "multi-mesh" (wavefield-adapted meshes) or "mono-mesh" (same mesh for every simulation, defined by the LASIF domain file).
    • model_interpolation_mode: Currently only "gll_2_gll" is supported, so you can leave this as it is as long as you use HDF5 meshes.
    • inversion_parameters: Parameters to invert for. Make sure these are the same ones as in the inversion.toml file in the SALVUS_OPT directory.
    • modelling_parameters: The parameters on the meshes you use for simulations.
    • n_random_events: Only relevant for "mini-batch" mode. Describes how many of the events selected in each batch are random, vs how many are selected based on spatial coverage.
    • min_ctrl_group_size: The minimum number of events in the control group. Again only relevant for "mini-batch" mode.
    • max_angular_change: Used to decide how many events make it into the control group for the coming iteration in "mini-batch" mode.
    • dropout_probability: A form of regularization. Events in the control group can be randomly dropped with this probability so that they don't get stuck there.
    • initial_batch_size: Make sure it's the same as in inversion.toml in "mini-batch" mode.
    • cut_source_region_from_gradient_in_km: Gradients become unphysical next to the source, so it can be good to cut that region out.
    • cut_receiver_region_from_gradient_in_km: The same for receivers, where the unphysical effect is much weaker. This is currently quite slow, so I would recommend just putting 0.0 here.
    • clip_gradient: You can clip the gradient at some percentile so that the highest and lowest values are removed. 1.0 doesn't clip at all.
    • absorbing_boundaries: This is only a True/False flag; the actual absorbing boundaries are configured in lasif_config.toml.
    • elements_per_azimuthal_quarter: Only relevant for "multi-mesh". Decides how many elements are used to sample the azimuthal dimension. See paper.
    • smoothing_mode: isotropic or anisotropic. The smoothing is always model dependent; this setting decides whether it is also direction dependent.
    • smoothing_lengths: How many wavelengths to smooth over. For anisotropic smoothing, give three values: radial, lat, lon. For isotropic smoothing, give only one value.
    • iterations_between_validation_checks: When using a validation dataset, this sets how many iterations pass between validation checks. The models between checks are averaged. 0 means no check.
    • validation_dataset: A list of events in your LASIF project that are reserved for validation checks and are not used in the inversion. Input event names.
    • test_dataset: Same principle as validation_dataset.
    • HPC.wave_propagation: Settings for the wavefield simulations. Inversionson asks for double the specified walltime in adjoint runs, as they are more expensive.
    • HPC.diffusion_equation: Settings regarding the smoothing computations.
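
Several of the constraints described above can be checked before starting a run. The following is a minimal sketch under stated assumptions, not Inversionson's own validation. It assumes the settings have been loaded into a flat Python dict keyed by the field names above (the real file may nest them differently), that smoothing_lengths is a list, and that clip_gradient is a fraction between 0 and 1. The function name check_inversion_info and the parameter names in the usage example are hypothetical.

```python
def check_inversion_info(info, opt_parameters=None):
    """Return a list of problems found in a dict of inversion_info settings."""
    problems = []

    def require_choice(key, choices):
        # Record a problem if the value is missing or not an allowed option.
        if info.get(key) not in choices:
            problems.append(f"{key} must be one of {sorted(choices)}")

    require_choice("inversion_mode", {"mini-batch", "mono-batch"})
    require_choice("meshes", {"multi-mesh", "mono-mesh"})
    require_choice("model_interpolation_mode", {"gll_2_gll"})
    require_choice("smoothing_mode", {"isotropic", "anisotropic"})

    # One smoothing length for isotropic, three (radial, lat, lon) for anisotropic.
    expected = {"isotropic": 1, "anisotropic": 3}.get(info.get("smoothing_mode"))
    lengths = info.get("smoothing_lengths", [])
    if expected is not None and len(lengths) != expected:
        problems.append(
            f"smoothing_lengths needs {expected} value(s), got {len(lengths)}"
        )

    # A probability has to lie in [0, 1].
    if not 0.0 <= info.get("dropout_probability", 0.0) <= 1.0:
        problems.append("dropout_probability must lie between 0 and 1")

    # Assumption: the percentile is a fraction, with 1.0 meaning no clipping.
    if not 0.0 < info.get("clip_gradient", 1.0) <= 1.0:
        problems.append("clip_gradient must be greater than 0 and at most 1")

    # Optionally compare against the parameters listed in inversion.toml.
    if opt_parameters is not None:
        if set(info.get("inversion_parameters", [])) != set(opt_parameters):
            problems.append("inversion_parameters differ from inversion.toml")

    return problems


if __name__ == "__main__":
    example = {
        "inversion_mode": "mini-batch",
        "meshes": "mono-mesh",
        "model_interpolation_mode": "gll_2_gll",
        "inversion_parameters": ["VSV", "VSH"],  # hypothetical parameter names
        "smoothing_mode": "anisotropic",
        "smoothing_lengths": [0.5, 1.0, 1.0],
        "dropout_probability": 0.15,
        "clip_gradient": 1.0,
    }
    for problem in check_inversion_info(example, opt_parameters=["VSV", "VSH"]):
        print(problem)
```

Returning a list of messages rather than raising on the first error makes it possible to fix every inconsistency in one pass before submitting any jobs.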
  10. Once the file is configured, you should be able to start running Inversionson.

    • I would recommend running Inversionson inside tmux, as it keeps your shell running even if you lose the connection to your computer or accidentally close your terminal window.
  11. Run Inversionson with this command:

python -m inversionson.autoinverter
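
The purely mechanical parts of the setup above (creating the project directory, the SALVUS_OPT directory and the run_salvus_opt.sh file) can be scripted. The following is a minimal sketch: the function name scaffold_project is hypothetical, and the Salvus binary path is something you have to supply yourself. It does not run LASIF or Salvus, and it does not fill in inversion.toml or inversion_info.toml.

```python
from pathlib import Path


def scaffold_project(project_dir, salvus_binary):
    """Create SALVUS_OPT and run_salvus_opt.sh inside the project directory.

    Returns the path of the generated shell script.
    """
    opt_dir = Path(project_dir) / "SALVUS_OPT"
    # Creates the project directory as well if it does not exist yet.
    opt_dir.mkdir(parents=True, exist_ok=True)

    # The script holds a single line, as described in the steps above.
    script = opt_dir / "run_salvus_opt.sh"
    script.write_text(f"{salvus_binary} invert -i ./inversion.toml\n")
    return script
```

The remaining steps still need to be done by hand, since setting up the LASIF project and filling in the two TOML files depend on the inversion you want to run.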

For any questions feel free to contact soelvi.thrastarson@erdw.ethz.ch