This is the official repository implementing the EPiC Flow Matching point cloud generative models from the paper 'EPiC-ly Fast Particle Cloud Generation with Flow-Matching and Diffusion'. A library that includes this model as well as additional losses, architectures and datasets can be found here.
EPiC Flow Matching is a Continuous Normalising Flow that is trained with a simulation-free approach called Flow Matching. The model uses DeepSet-based EPiC layers for its architecture, which scale well to large set sizes.
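The Flow Matching training objective can be sketched as follows: sample a point on a probability path between noise and data and regress the model onto the path's velocity. This is a minimal NumPy sketch of the commonly used straight-line (optimal-transport) path, not the paper's exact implementation; the function and variable names are illustrative only.

```python
import numpy as np

def flow_matching_target(x0, x1, t):
    """Straight-line probability path used in (conditional) flow matching.

    x0: noise sample, x1: data sample, t: time in [0, 1] per batch element.
    Returns the interpolated point x_t and the velocity regression target.
    (Simplified sketch; the paper's exact path/schedule may differ.)
    """
    t = t.reshape(-1, *([1] * (x0.ndim - 1)))  # broadcast t over point-cloud dims
    x_t = (1.0 - t) * x0 + t * x1              # point on the straight-line path
    v_target = x1 - x0                         # constant velocity along that path
    return x_t, v_target

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 30, 3))  # batch of 4 "jets", 30 particles, 3 features
x1 = rng.standard_normal((4, 30, 3))
t = rng.uniform(size=4)
x_t, v = flow_matching_target(x0, x1, t)
# training would minimise || model(x_t, t) - v_target ||^2
```

At t=0 the interpolant is pure noise and at t=1 it is the data sample; the network learns the velocity field that transports one into the other.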
The models are tested on the JetNet dataset, which is used in particle physics to benchmark point cloud generative deep learning architectures. It consists of simulated particle jets produced in proton-proton collisions in a simplified detector. The dataset is split into jets originating from top quarks, light quarks, gluons, W bosons and Z bosons, with a maximum of 150 particles per jet.
This repository uses PyTorch Lightning, Hydra for model configurations, and supports logging with Comet and wandb. For a deeper explanation of how to use this repository, please have a look at the template directly.
Install dependencies
# clone project
git clone https://github.com/YourGithubName/your-repo-name
cd your-repo-name
# [OPTIONAL] create conda environment
conda create -n myenv python=3.10
conda activate myenv
# install pytorch according to instructions
# https://pytorch.org/get-started/
# install requirements
pip install -r requirements.txt
Create a .env file to set paths and API keys
PROJECT_ROOT="/folder/folder/"
DATA_DIR="/folder/folder/"
LOG_DIR="/folder/folder/"
COMET_API_TOKEN="XXXXXXXXXX"
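The template normally picks these variables up automatically (e.g. via python-dotenv). For illustration only, a hypothetical helper that reads such a file into the environment could look like this; `load_env` is not part of this repository.

```python
import os
from pathlib import Path

def load_env(path=".env"):
    """Parse simple KEY="value" lines into os.environ (no-op if the file is missing)."""
    env_file = Path(path)
    if not env_file.exists():
        return
    for line in env_file.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # keep any value already set in the environment
        os.environ.setdefault(key.strip(), value.strip().strip('"'))

load_env()
data_dir = os.environ.get("DATA_DIR")  # path configured in the .env file above
```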
Train model with default configuration
# train on CPU
python src/train.py trainer=cpu
# train on GPU
python src/train.py trainer=gpu
Train model with chosen experiment configuration from configs/experiment/
python src/train.py experiment=experiment_name.yaml
You can override any parameter from the command line like this
python src/train.py trainer.max_epochs=20 data.batch_size=64
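Frequently used overrides can also be collected into an experiment config under configs/experiment/. A hypothetical example is sketched below; the actual config groups and keys depend on this repository's configs/ layout.

```yaml
# configs/experiment/my_experiment.yaml (illustrative only)
# @package _global_
defaults:
  - override /trainer: gpu

trainer:
  max_epochs: 20
data:
  batch_size: 64
```

It can then be selected with `python src/train.py experiment=my_experiment.yaml`.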
The experiments include:

fm_tops30_cond
EPiC Flow Matching trained on the top30 dataset with conditioning on jet mass and pt

fm_tops30
EPiC Flow Matching trained on the top30 dataset with no additional conditioning (jet size conditioning is a necessity for the architecture)

fm_tops150_cond
EPiC Flow Matching trained on the top150 dataset with conditioning on jet mass and pt

fm_tops150
EPiC Flow Matching trained on the top150 dataset with no additional conditioning (jet size conditioning is a necessity for the architecture)

During training and evaluation, metrics and plots can be logged via Comet and wandb. After training, the model is evaluated automatically and the final results are saved locally and logged via the selected loggers. The evaluation can also be started manually like this
python src/eval.py experiment=experiment_name.yaml ckpt_path=checkpoint_path
Notebooks are available to quickly train and evaluate models and to create plots.