Train MLP-based surrogate models for LSS power spectra.
To generate training data, you need CobayaLSS (https://github.com/martinjameswhite/CobayaLSS), specifically its provider branch.
All other requirements are listed in requirements.txt.
Any Cobaya config file can be converted to generate training data by adding an emulate section at the end of the config (see, e.g., configs/lrg_x_planck_aemulus_20xfast_rs_spectra_1e6pts_training_data.yaml).
The priors in the config specify the parameter space range over which training data will be generated.
The keywords under emulate in your config file control how much training data to generate and where to write it. In particular, output_filename specifies where the training data is written, and nend is the total number of training points (fast and slow) to generate. You can list the fast parameters in param_names_fast and set the factor by which to oversample them in nfast_per_slow.
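As a rough illustration of how these keywords fit together, the sketch below writes an emulate block as a Python dictionary and works out how many slow-parameter points it would imply. Only the four keyword names come from this README. The output path, the fast parameter names, and the numeric values are placeholders, and the relation "slow points = nend / nfast_per_slow" is an assumption read off the description above, not something checked against the code.

```python
# Placeholder values for the keywords under `emulate`.
# Only the keyword names are taken from this README.
emulate = {
    "output_filename": "path/to/training_data_file",  # placeholder path
    "nend": 1_000_000,                  # total number of training points
    "param_names_fast": ["b1", "b2"],   # hypothetical fast parameter names
    "nfast_per_slow": 20,               # oversampling factor for fast parameters
}


def n_slow_points(cfg):
    """Number of distinct slow-parameter points implied by the settings.

    Assumes every slow point is paired with `nfast_per_slow` fast draws,
    so that nend = n_slow * nfast_per_slow.
    """
    n_slow, remainder = divmod(cfg["nend"], cfg["nfast_per_slow"])
    if remainder:
        raise ValueError("nend is not a multiple of nfast_per_slow")
    return n_slow


print(n_slow_points(emulate))
```

Under that assumption, increasing nfast_per_slow at fixed nend trades slow-parameter evaluations for additional fast-parameter draws.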
Run generate_training_data.sh, replacing the config file with your modified version. Training data can (and probably should) be generated in parallel via MPI; generate_training_data.sh includes an example.
See train_emu.sh for an example of how to train the emulators. You'll need to modify configs/train_p0_scan_4_128_1.yaml so that its training_filename keyword points to your new training data file.
This will produce one JSON file per surrogate model. These can be read in with the Emulator class in emulator.py to make predictions. Parameters must be passed to the Emulator class in the same order that the Cobaya model used for training assumed.
See notebooks/train_surrogates.ipynb for a toy example of this.
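Because the parameter order matters, it can help to keep values in a dictionary and build the input vector from an explicit list of names. The helper below is a minimal sketch of that idea. It is not part of emulator.py, and the parameter names and values are hypothetical; the real order has to be taken from the Cobaya model used to generate your training data.

```python
def ordered_parameter_vector(values, training_order):
    """Arrange named parameter values in the order used during training.

    `values` maps parameter names to numbers; `training_order` is the list
    of names in the order the training model assumed.
    """
    missing = [name for name in training_order if name not in values]
    if missing:
        raise KeyError(f"missing parameters: {missing}")
    return [float(values[name]) for name in training_order]


# Hypothetical parameter names and values, for illustration only.
training_order = ["omega_b", "omega_cdm", "h", "b1"]
values = {"b1": 1.9, "h": 0.68, "omega_cdm": 0.12, "omega_b": 0.0224}

params = ordered_parameter_vector(values, training_order)
print(params)
```

The resulting list can then be converted to whatever array type the emulator expects before it is passed in.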
We make available a number of pre-trained models in nn_weights:
Halofit P_mm : nn_weights/lrg_x_planck_cleft_priors_buzzard_shape_halofit_pmm_20xfast_rs_spectra_1e6pts_training_data_v1_pmm_emu.json
CLEFT P_gm : nn_weights/lrg_x_planck_cleft_priors_buzzard_shape_20xfast_rs_spectra_1e6pts_training_data_v1_pgm_emu.json
CLEFT P_gg : nn_weights/lrg_x_planck_cleft_priors_buzzard_shape_20xfast_rs_spectra_1e6pts_training_data_v1_pgg_emu.json
HEFT (anzu) P_mm : nn_weights/lrg_x_planck_aemulus_priors_20xfast_rs_spectra_1e6pts_training_data_v1_pmm_emu.json
HEFT (anzu) P_gm : nn_weights/lrg_x_planck_aemulus_priors_20xfast_rs_spectra_1e6pts_training_data_v1_pgm_emu.json
HEFT (anzu) P_gg : nn_weights/lrg_x_planck_aemulus_priors_20xfast_rs_spectra_1e6pts_training_data_v1_pgg_emu.json
Lagrangian EFT P_0 : nn_weights/ptchallenge_cmass2_20xfast_1e6pts_training_data_v2_p0_emu.json
Lagrangian EFT P_2 : nn_weights/ptchallenge_cmass2_20xfast_1e6pts_training_data_v2_p2_emu.json
Lagrangian EFT P_4 : nn_weights/ptchallenge_cmass2_20xfast_1e6pts_training_data_v2_p4_emu.json
See notebooks/pretrained_models.ipynb for details on how to load and call these models.
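The file names listed above follow a common pattern (a training-data prefix, the spectrum name, then _emu.json), so a small lookup can save retyping them. The sketch below is one way to do this. The short model labels ("halofit", "cleft", "heft", "lagrangian_eft") and the weights_path helper are made up for illustration and do not exist in the repository; the prefixes and spectrum names are copied from the list above.

```python
from pathlib import Path

NN_WEIGHTS = Path("nn_weights")

# Training-data prefixes copied from the file list; the keys are
# illustrative labels, not names used by the repository.
PREFIXES = {
    "halofit": "lrg_x_planck_cleft_priors_buzzard_shape_halofit_pmm_20xfast_rs_spectra_1e6pts_training_data_v1",
    "cleft": "lrg_x_planck_cleft_priors_buzzard_shape_20xfast_rs_spectra_1e6pts_training_data_v1",
    "heft": "lrg_x_planck_aemulus_priors_20xfast_rs_spectra_1e6pts_training_data_v1",
    "lagrangian_eft": "ptchallenge_cmass2_20xfast_1e6pts_training_data_v2",
}

# Spectra for which a pre-trained file is listed for each model.
SPECTRA = {
    "halofit": ("pmm",),
    "cleft": ("pgm", "pgg"),
    "heft": ("pmm", "pgm", "pgg"),
    "lagrangian_eft": ("p0", "p2", "p4"),
}


def weights_path(model, spectrum):
    """Return the path of the pre-trained JSON file for a model and spectrum."""
    if spectrum not in SPECTRA[model]:
        raise ValueError(f"no pre-trained {spectrum} emulator listed for {model}")
    return NN_WEIGHTS / f"{PREFIXES[model]}_{spectrum}_emu.json"


path = weights_path("lagrangian_eft", "p2")
print(path, "(found)" if path.exists() else "(not found from this directory)")
```

The existence check assumes the script runs from the repository root, where nn_weights lives.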