# EPiC Flow Matching

EPiC Flow Matching implementation for generating jets as point clouds (https://arxiv.org/abs/2310.00049).

## Description

This is the official repository implementing the EPiC Flow Matching point cloud generative models from the paper 'EPiC-ly Fast Particle Cloud Generation with Flow-Matching and Diffusion'. A library that includes this model as well as additional losses, architectures, and datasets can be found here.

EPiC Flow Matching is a continuous normalising flow that is trained with a simulation-free approach called Flow Matching. The model uses DeepSets-based EPiC layers for the architecture, which allow it to scale well to large set sizes.
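
For intuition, the core training objective can be sketched in a few lines of PyTorch: sample a noise point cloud, interpolate it linearly towards a real one, and regress the velocity of that path. This is a minimal illustration of the flow matching idea, not the repository's actual training code, and the `model(points, time)` signature is an assumption:

```python
import torch

def flow_matching_loss(model, x1):
    """One flow matching training step on a batch of point clouds x1."""
    # x1: real point clouds, shape (batch, num_points, num_features)
    x0 = torch.randn_like(x1)                            # sample from the Gaussian prior
    t = torch.rand(x1.shape[0], 1, 1, device=x1.device)  # one time in [0, 1] per sample
    xt = (1 - t) * x0 + t * x1                           # point on the linear probability path
    v_target = x1 - x0                                   # velocity of that path
    v_pred = model(xt, t)                                # hypothetical signature: model(points, time)
    return torch.mean((v_pred - v_target) ** 2)          # regress the velocity field
```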

The models are tested on the JetNet dataset, which is used in particle physics to benchmark point cloud generative deep learning architectures. It consists of simulated particle jets produced in proton-proton collisions in a simplified detector. The dataset is split into jets originating from top quarks, light quarks, gluons, W bosons, and Z bosons, with a maximum of 150 particles per jet.
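
For a quick look at the data itself, the external `jetnet` package provides a download helper. This is a sketch assuming its documented `JetNet.getData` interface; this repository ships its own data-handling code:

```python
from jetnet.datasets import JetNet

# download (if necessary) and load the top-quark jets;
# particle features default to (etarel, phirel, ptrel, mask)
particle_data, jet_data = JetNet.getData(jet_type="t", data_dir="data/")
print(particle_data.shape)  # (num_jets, max_particles, num_particle_features)
print(jet_data.shape)       # (num_jets, num_jet_features)
```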

This repository uses PyTorch Lightning, with Hydra for model configuration, and supports logging with Comet and wandb. For a deeper explanation of how to use this repository, please have a look at the underlying template directly.
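
The training script follows the usual Hydra entry-point pattern, roughly as sketched below; the exact `config_path` and `config_name` used here are assumptions:

```python
import hydra
from omegaconf import DictConfig

@hydra.main(version_base="1.3", config_path="configs", config_name="train.yaml")
def main(cfg: DictConfig) -> None:
    # cfg is composed from the YAML files under configs/ plus any
    # command-line overrides, e.g. trainer=gpu or trainer.max_epochs=20
    print(cfg)

if __name__ == "__main__":
    main()
```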

## How to run

Install dependencies

```bash
# clone project
git clone https://github.com/YourGithubName/your-repo-name
cd your-repo-name

# [OPTIONAL] create conda environment
conda create -n myenv python=3.10
conda activate myenv

# install pytorch according to instructions
# https://pytorch.org/get-started/

# install requirements
pip install -r requirements.txt
```

Create a `.env` file to set paths and API keys

```bash
PROJEKT_ROOT="/folder/folder/"
DATA_DIR="/folder/folder/"
LOG_DIR="/folder/folder/"
COMET_API_TOKEN="XXXXXXXXXX"
```
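
These variables are read from the environment at startup; a minimal sketch of how such a `.env` file can be loaded in Python, assuming the `python-dotenv` package (template-based repositories typically handle this for you):

```python
import os
from dotenv import load_dotenv

load_dotenv()                 # read .env from the working directory into os.environ
print(os.getenv("DATA_DIR"))  # e.g. "/folder/folder/"
```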

Train model with default configuration

```bash
# train on CPU
python src/train.py trainer=cpu

# train on GPU
python src/train.py trainer=gpu
```

Train model with a chosen experiment configuration from `configs/experiment/`

```bash
python src/train.py experiment=experiment_name.yaml
```

You can override any parameter from the command line like this:

```bash
python src/train.py trainer.max_epochs=20 data.batch_size=64
```

The experiments include:

- `fm_tops30_cond`: EPiC Flow Matching trained on the top30 dataset, conditioned on jet mass and jet pt (conditioning enters the network as sketched after this list)
- `fm_tops30`: EPiC Flow Matching trained on the top30 dataset with no additional conditioning (jet size conditioning is still a necessity for the architecture)
- `fm_tops150_cond`: EPiC Flow Matching trained on the top150 dataset, conditioned on jet mass and jet pt
- `fm_tops150`: EPiC Flow Matching trained on the top150 dataset with no additional conditioning (jet size conditioning is still a necessity for the architecture)
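
To illustrate how such conditioning can enter a DeepSets-based architecture, here is a hypothetical, heavily simplified EPiC-style layer that concatenates a conditioning vector (e.g. jet mass and pt) to the pooled global features; the actual model differs in its details:

```python
import torch
import torch.nn as nn

class SimplifiedEPiCLayer(nn.Module):
    """Illustrative sketch only; not the architecture used in this repository."""

    def __init__(self, hid_dim: int, cond_dim: int):
        super().__init__()
        # global update: pooled set features + conditioning -> global feature
        self.phi_global = nn.Sequential(nn.Linear(2 * hid_dim + cond_dim, hid_dim), nn.LeakyReLU())
        # local update: per-point features + broadcast global feature -> new per-point features
        self.phi_local = nn.Sequential(nn.Linear(2 * hid_dim, hid_dim), nn.LeakyReLU())

    def forward(self, points: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # points: (batch, num_points, hid_dim); cond: (batch, cond_dim)
        pooled = torch.cat([points.mean(dim=1), points.sum(dim=1), cond], dim=-1)
        global_feat = self.phi_global(pooled)
        # broadcasting the global feature to every point keeps the layer permutation equivariant
        global_exp = global_feat.unsqueeze(1).expand(-1, points.shape[1], -1)
        return self.phi_local(torch.cat([points, global_exp], dim=-1))
```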

During training and evaluation, metrics and plots can be logged via Comet and wandb. After training, the model is evaluated automatically, and the final results are saved locally and logged via the selected loggers. The evaluation can also be started manually like this:

```bash
python src/eval.py experiment=experiment_name.yaml ckpt_path=checkpoint_path
```

Notebooks are available to quickly train and evaluate models and to create plots.