lorenz05-cpp-acceleration

简体中文

Python package for data assimilation experiments with the Lorenz 05 model, with cpp acceleration.

Updates

2023.10.17 v1.0.2: Add custom initial condition data and natural run data generation.

.npy format inital data and natural run (truth, observation) data now can be customized generated by DataGenerator. See Usage for details.

2023.10.11 v1.0.1: Model advance rewritten in C++ with multi-threading.

The Lorenz 05 model advance is now rewritten in C++ with multi-threading, resulting in a 10x speedup compared to the original Python code. A 2000 ensemble size, 5 years EnKF experiment can be finished with 4-5h. (Tested on Intel Core i9-13900K, 6400MHz 64G DDR5 RAM)

2023.10.8 v1.0.0: Initial release.

Features

assmanager encapsulates the Lorenz 05 model and data assimilation processes, offering high usability and extensibility.

Define custom experiment parameters using either ini files or Python dictionaries for easy execution of multiple experiments.
Each experiment's results and parameters are saved in a separate folder for easy replication.
Modular design for easy addition of new data assimilation methods and inflation/localization schemes.
High speed. The Lorenz 05 model advance is written in C++.

Installation

WARNING: This package is only tested on Python 3.11, the use of other Python versions may cause errors.

Install

Install assmanager using the following command:

git clone https://github.com/zyzhao0926/L05-cpp-acceleration.git

Go to L05-cpp-acceleration/assmanager/model/step_L04/cpu_parallel/, then compile the cpp extension:

make clean
make

Back to L05-cpp-acceleration, if you want to install this package in a conda virtual environment, make sure you activate your conda env by conda activate your_conda_env. If not, ignore this step.

python setup.py sdist
pip install dist/assmanager-1.0.1.tar.gz

The dependencies will be installled automatically.

Uninstall

To uninstall assmanager, use:

pip uninstall assmanager

Quick Start

from assmanager import AssManager

am = AssManager()
am.run()

Running the above code will perform a data assimilation experiment with default parameters for the Lorenz 05 model.

Usage

First, you need to set experiment parameters. Assmanager supports setting experiment parameters using Python dictionaries or ini configuration files.

Setting Experiment Parameters

When using a dictionary, the structure passed to AssManager is as follows:

config = {
    'model_params': {
        'model_size': 960, 
        'forcing': 15.00,
        'time_steps': 200 * 360, 
        ...
    },

    'DA_params': {
        'obs_density': 4,
        'obs_freq_timestep': 50,
        'obs_error_var': 1.0,
    },

    'DA_option': {
        'save_prior_ensemble': False,
        'save_prior_mean': True,
        ...
    },

    'Input_file_paths': {
        'ics_path': 'data/ics_ms3_from_zt1year_sz3001.mat',
        ...
    },

    'IC_data': {
        'ics_imem_beg': 248,
        ...
    },

    'Experiment_option': {
        'verbose': True,
        'experiment_name': 'test',
        ...
    }
}

Instantiate AssManager with the config:

am = AssManager(config)

When using an ini configuration file, first edit the ini configuration file and then pass the ini file directory when instantiating AssManager:

config_path = './config.ini'
am = AssManager(config_path)

Generating Initial Conditions and Observation Data

Initial conditions and observation data can be generated using the DataGenerator class. The DataGenerator class supports generating initial conditions and observation data in .npy format.

from assmanager import DataGenerator

dg = DataGenerator({'forcing': 15})
dg.generate_ics(3001)
dg.generate_nr()

Running the above code will generate initial conditions and natural run data in your current directory. If you want to specify the save directory, you can pass the data_save_path parameter to the generate_ics and generate_nr methods.

Default Experiment Parameters and Meanings

When no custom parameter configuration is provided, assmanager will use the default config.ini file with the following content:

[model_params]
model_size = 960                    # N
forcing = 15.00                     # F
space_time_scale = 10.00            # b
coupling = 3.00                     # c
smooth_steps = 12                   # I
K = 32
delta_t = 0.001
time_step_days = 0
time_step_seconds = 432
time_steps = 200 * 360              # 1 day~ 200 time steps
model_number = 3                    # 2 scale

[DA_params]
model_size = 960
time_steps = 200 * 360              # 1 day~ 200 time steps
obs_density = 4
obs_freq_timestep = 50
obs_error_var = 1.0

[DA_config]
ensemble_size = 40
filter = EnKF
update_method = serial_update               # serial_update, parallel_update
inflation_method = multiplicative           # None, multiplicative
inflation_factor = 1.01
inflation_sequence = before_DA              # before_DA, after_DA
localization_method = GC                    # None, GC, CNN
localization_radius = 240

[DA_option]
save_prior_ensemble = False
save_prior_mean = True
save_analysis_ensemble = False
save_analysis_mean = True
save_observation = False
save_truth = False
save_kalman_gain = False
save_prior_rmse = True
save_analysis_rmse = True
save_prior_spread_rmse = True
save_analysis_spread_rmse = True
file_save_option = single_file

[Input_file_paths]
ics_path = /data1/zyzhao/scratch/data/ics_ms3_from_zt1year_sz3001.mat # directory must exist
ics_key = zics_total1                                                 # optionnal, if needed
obs_path = /data1/zyzhao/scratch/data/obs_ms3_err1_240s_6h_5y.mat     # directory must exist
obs_key = zobs_total                                                  # optionnal, if needed
truth_path = /data1/zyzhao/scratch/data/zt_25year_ms3_6h.mat          # directory must exist
truth_key = zens_times                                                # optionnal, if needed

[IC_data]
ics_imem_beg = 248

[Experiment_option]
verbose = True
experiment_name = default
result_save_path = /data1/zyzhao/L05_experiments                      # must be valid path
data_save_path = data                       # data folder name, choose what u like
file_save_type = npy                        # npy, mat, dat
prior_ensemble_filename = zens_prior        # prior ensemble file will be saved as zens_prior.npy
prior_mean_filename = prior                 # the same as above
analysis_ensemble_filename = zens_analy
analysis_mean_filename = analy
obs_filename = zobs
truth_filename = ztruth
kalman_gain_filename = kg
prior_rmse_filename = prior_rmse
analysis_rmse_filename = analy_rmse
prior_spread_rmse_filename = prior_spread_rmse
analysis_spread_rmse_filename = analy_spread_rmse

The parameter configuration is divided into seven major sections:

model_params: Configuration of parameters for the Lorenz 05 model.
DA_params: External parameter settings used for data assimilation, including the total number of integration steps, observation density, observation frequency, and more.
DA_config: Internal parameter settings for data assimilation, including ensemble size, data assimilation algorithm, inflation, and localization settings.
DA_option: Data saving switches to specify which data to save, including ensemble mean, Kalman gain, RMSE, and more.
Input_file_paths: Configuration for reading external data, such as initial conditions, observation data, and ground truth data file directories.
IC_data: Configuration for specifying which ensemble members are included in the initial conditions.
Experiment_option: Experiment name, experiment save directory, data save format, and file name settings.

Running Multiple Experiments

See assmanager/demo.py for an example of running multiple experiments at once.

# demo.py
from assmanager import AssManager

inflation_values = [1.05]
inflation_sequences = ['before_DA']
ensemble_size = 2000
forcings = [16, 15]
time_steps = 200 * 360 * 5
# time_steps = 200 * 20
configs = []

for inf in inflation_values:
    for seq in inflation_sequences:
        for forcing in forcings:
            configs.append(
                {
                    'model_params': {
                        'forcing': forcing,
                        'time_steps': time_steps,
                    },
                    
                    'DA_params': {
                        'time_steps': time_steps,
                    },
                    
                    'DA_config': {
                        'ensemble_size': ensemble_size,
                        'inflation_factor': inf,
                        'inflation_sequence': seq,
                        'filter': 'EnKF',
                    },
                    
                    'DA_option': {
                        'save_kalman_gain': True,
                        'save_prior_ensemble': True,
                        'save_analysis_ensemble': True,
                        'file_save_option': 'multiple_files',
                        # 'file_save_option': 'single_file',
                    },

                    'Experiment_option': {
                        'experiment_name': f'EnKF_F{forcing}_inf_{inf}_{seq}_sz{ensemble_size}_5y_cpptest',
                        'result_save_path': '/mnt/pve_nfs/zyzhao/L05_experiments',
                    }
                }
            )

ams = [AssManager(config) for config in configs]
for am in ams:
    am.run()

Extensions

This framework supports the extension of data assimilation (DA) algorithms, inflation methods, and localization methods.

inflation

To add a new inflation method, open filter/ensembleFilter.py, find the inflation function within the ensembleFilter class, and add an additional if statement for your new method:

# filter/ensembleFilter.py

def inflation(self, zens: np.mat) -> np.mat:
    """ inflation

    Args:
        zens (np.mat): state ensemble
        zens_prior (np.mat): prior state ensemble

    Returns:
        np.mat: inflated state ensemble
    """
    if self.inflation_method is None:
        return zens
    
    elif self.inflation_method == 'multiplicative':
        ens_mean = np.mean(zens, axis=0)
        ens_prime = zens - ens_mean
        zens_inf = ens_mean + self.inflation_factor * ens_prime
        return zens_inf
    
    elif self.inflation_method == 'your inflation method':
        # some calculations here
        # if you need previous prior and analysis mean, you can access them by self.prior_mean and self.analy_mean
        zens_inf = ...
        return zens_inf

localization

To add a new localization method, open filter/ensembleFilter.py, find the construct_GC_2d function, create a new function to compute the localization matrix ρ, and modify the __get_localization_matrix method in the ensembleFilter class to use your new localization method.

DA method

To add a new DA method, copy the EnKF.py file in the filter/ directory and rename it to your_DA_method.py. Modify the class name to yourDAMethod, and then make changes to the __serial_update and __parallel_update methods.

Author

Zhiyu Zhao
zyzh@smail.nju.edu.cn
qq: 605601949

Acknowledgments

Zhongrui Wang provided the L05 Python code with Numba acceleration and the DA program.

lantel-wm/L05-cpp-acceleration