Python package for data assimilation experiments with the Lorenz 05 model, with cpp acceleration.
.npy
format inital data and natural run (truth, observation) data now can be customized generated byDataGenerator
. See Usage for details.
- The Lorenz 05 model advance is now rewritten in C++ with multi-threading, resulting in a 10x speedup compared to the original Python code. A 2000 ensemble size, 5 years EnKF experiment can be finished with 4-5h. (Tested on Intel Core i9-13900K, 6400MHz 64G DDR5 RAM)
assmanager encapsulates the Lorenz 05 model and data assimilation processes, offering high usability and extensibility.
- Define custom experiment parameters using either ini files or Python dictionaries for easy execution of multiple experiments.
- Each experiment's results and parameters are saved in a separate folder for easy replication.
- Modular design for easy addition of new data assimilation methods and inflation/localization schemes.
- High speed. The Lorenz 05 model advance is written in C++.
WARNING: This package is only tested on Python 3.11, the use of other Python versions may cause errors.
Install assmanager
using the following command:
git clone https://github.com/zyzhao0926/L05-cpp-acceleration.git
Go to L05-cpp-acceleration/assmanager/model/step_L04/cpu_parallel/
, then compile the cpp extension:
make clean
make
Back to L05-cpp-acceleration
, if you want to install this package in a conda virtual environment, make sure you activate your conda env by conda activate your_conda_env
. If not, ignore this step.
python setup.py sdist
pip install dist/assmanager-1.0.1.tar.gz
The dependencies will be installled automatically.
To uninstall assmanager
, use:
pip uninstall assmanager
from assmanager import AssManager
am = AssManager()
am.run()
Running the above code will perform a data assimilation experiment with default parameters for the Lorenz 05 model.
First, you need to set experiment parameters. Assmanager supports setting experiment parameters using Python dictionaries or ini configuration files.
When using a dictionary, the structure passed to AssManager is as follows:
config = {
'model_params': {
'model_size': 960,
'forcing': 15.00,
'time_steps': 200 * 360,
...
},
'DA_params': {
'obs_density': 4,
'obs_freq_timestep': 50,
'obs_error_var': 1.0,
},
'DA_option': {
'save_prior_ensemble': False,
'save_prior_mean': True,
...
},
'Input_file_paths': {
'ics_path': 'data/ics_ms3_from_zt1year_sz3001.mat',
...
},
'IC_data': {
'ics_imem_beg': 248,
...
},
'Experiment_option': {
'verbose': True,
'experiment_name': 'test',
...
}
}
Instantiate AssManager with the config:
am = AssManager(config)
When using an ini configuration file, first edit the ini configuration file and then pass the ini file directory when instantiating AssManager:
config_path = './config.ini'
am = AssManager(config_path)
Initial conditions and observation data can be generated using the DataGenerator
class. The DataGenerator
class supports generating initial conditions and observation data in .npy
format.
from assmanager import DataGenerator
dg = DataGenerator({'forcing': 15})
dg.generate_ics(3001)
dg.generate_nr()
Running the above code will generate initial conditions and natural run data in your current directory. If you want to specify the save directory, you can pass the data_save_path
parameter to the generate_ics
and generate_nr
methods.
When no custom parameter configuration is provided, assmanager will use the default config.ini file with the following content:
[model_params]
model_size = 960 # N
forcing = 15.00 # F
space_time_scale = 10.00 # b
coupling = 3.00 # c
smooth_steps = 12 # I
K = 32
delta_t = 0.001
time_step_days = 0
time_step_seconds = 432
time_steps = 200 * 360 # 1 day~ 200 time steps
model_number = 3 # 2 scale
[DA_params]
model_size = 960
time_steps = 200 * 360 # 1 day~ 200 time steps
obs_density = 4
obs_freq_timestep = 50
obs_error_var = 1.0
[DA_config]
ensemble_size = 40
filter = EnKF
update_method = serial_update # serial_update, parallel_update
inflation_method = multiplicative # None, multiplicative
inflation_factor = 1.01
inflation_sequence = before_DA # before_DA, after_DA
localization_method = GC # None, GC, CNN
localization_radius = 240
[DA_option]
save_prior_ensemble = False
save_prior_mean = True
save_analysis_ensemble = False
save_analysis_mean = True
save_observation = False
save_truth = False
save_kalman_gain = False
save_prior_rmse = True
save_analysis_rmse = True
save_prior_spread_rmse = True
save_analysis_spread_rmse = True
file_save_option = single_file
[Input_file_paths]
ics_path = /data1/zyzhao/scratch/data/ics_ms3_from_zt1year_sz3001.mat # directory must exist
ics_key = zics_total1 # optionnal, if needed
obs_path = /data1/zyzhao/scratch/data/obs_ms3_err1_240s_6h_5y.mat # directory must exist
obs_key = zobs_total # optionnal, if needed
truth_path = /data1/zyzhao/scratch/data/zt_25year_ms3_6h.mat # directory must exist
truth_key = zens_times # optionnal, if needed
[IC_data]
ics_imem_beg = 248
[Experiment_option]
verbose = True
experiment_name = default
result_save_path = /data1/zyzhao/L05_experiments # must be valid path
data_save_path = data # data folder name, choose what u like
file_save_type = npy # npy, mat, dat
prior_ensemble_filename = zens_prior # prior ensemble file will be saved as zens_prior.npy
prior_mean_filename = prior # the same as above
analysis_ensemble_filename = zens_analy
analysis_mean_filename = analy
obs_filename = zobs
truth_filename = ztruth
kalman_gain_filename = kg
prior_rmse_filename = prior_rmse
analysis_rmse_filename = analy_rmse
prior_spread_rmse_filename = prior_spread_rmse
analysis_spread_rmse_filename = analy_spread_rmse
The parameter configuration is divided into seven major sections:
- model_params: Configuration of parameters for the Lorenz 05 model.
- DA_params: External parameter settings used for data assimilation, including the total number of integration steps, observation density, observation frequency, and more.
- DA_config: Internal parameter settings for data assimilation, including ensemble size, data assimilation algorithm, inflation, and localization settings.
- DA_option: Data saving switches to specify which data to save, including ensemble mean, Kalman gain, RMSE, and more.
- Input_file_paths: Configuration for reading external data, such as initial conditions, observation data, and ground truth data file directories.
- IC_data: Configuration for specifying which ensemble members are included in the initial conditions.
- Experiment_option: Experiment name, experiment save directory, data save format, and file name settings.
See assmanager/demo.py
for an example of running multiple experiments at once.
# demo.py
from assmanager import AssManager
inflation_values = [1.05]
inflation_sequences = ['before_DA']
ensemble_size = 2000
forcings = [16, 15]
time_steps = 200 * 360 * 5
# time_steps = 200 * 20
configs = []
for inf in inflation_values:
for seq in inflation_sequences:
for forcing in forcings:
configs.append(
{
'model_params': {
'forcing': forcing,
'time_steps': time_steps,
},
'DA_params': {
'time_steps': time_steps,
},
'DA_config': {
'ensemble_size': ensemble_size,
'inflation_factor': inf,
'inflation_sequence': seq,
'filter': 'EnKF',
},
'DA_option': {
'save_kalman_gain': True,
'save_prior_ensemble': True,
'save_analysis_ensemble': True,
'file_save_option': 'multiple_files',
# 'file_save_option': 'single_file',
},
'Experiment_option': {
'experiment_name': f'EnKF_F{forcing}_inf_{inf}_{seq}_sz{ensemble_size}_5y_cpptest',
'result_save_path': '/mnt/pve_nfs/zyzhao/L05_experiments',
}
}
)
ams = [AssManager(config) for config in configs]
for am in ams:
am.run()
This framework supports the extension of data assimilation (DA) algorithms, inflation methods, and localization methods.
To add a new inflation method, open filter/ensembleFilter.py
, find the inflation
function within the ensembleFilter
class, and add an additional if
statement for your new method:
# filter/ensembleFilter.py
def inflation(self, zens: np.mat) -> np.mat:
""" inflation
Args:
zens (np.mat): state ensemble
zens_prior (np.mat): prior state ensemble
Returns:
np.mat: inflated state ensemble
"""
if self.inflation_method is None:
return zens
elif self.inflation_method == 'multiplicative':
ens_mean = np.mean(zens, axis=0)
ens_prime = zens - ens_mean
zens_inf = ens_mean + self.inflation_factor * ens_prime
return zens_inf
elif self.inflation_method == 'your inflation method':
# some calculations here
# if you need previous prior and analysis mean, you can access them by self.prior_mean and self.analy_mean
zens_inf = ...
return zens_inf
To add a new localization method, open filter/ensembleFilter.py
, find the construct_GC_2d function
, create a new function to compute the localization matrix ρ, and modify the __get_localization_matrix
method in the ensembleFilter
class to use your new localization method.
To add a new DA method, copy the EnKF.py
file in the filter/ directory and rename it to your_DA_method.py
. Modify the class name to yourDAMethod
, and then make changes to the __serial_update
and __parallel_update
methods.
- Zhiyu Zhao
- zyzh@smail.nju.edu.cn
- qq: 605601949
Zhongrui Wang provided the L05 Python code with Numba acceleration and the DA program.