Neural network architecture that is fully equivariant with respect to transformations under the Lorentz group, a fundamental symmetry of space and time in physics.
This repository holds the software and technical information for a new neural network architecture based on Lorentz group equivariance. The network's usage and performance are demonstrated in the context of high-energy hadronic jet physics.
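To make the symmetry concrete, here is a minimal numerical sketch (not taken from the LGN source) showing that the Minkowski inner product of two 4-momenta is unchanged by a Lorentz boost. The helper names `boost_z` and `minkowski_dot` are illustrative only, and the metric signature (+, -, -, -) is an assumed convention.

```python
import numpy as np

# Minkowski metric with signature (+, -, -, -), acting on (E, px, py, pz).
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def boost_z(beta):
    """Return the 4x4 matrix of a boost along z with velocity beta (c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    boost = np.eye(4)
    boost[0, 0] = boost[3, 3] = gamma
    boost[0, 3] = boost[3, 0] = -gamma * beta
    return boost


def minkowski_dot(p, q):
    """Minkowski inner product E_p * E_q - (p_vec . q_vec)."""
    return p @ METRIC @ q


p = np.array([5.0, 1.0, 2.0, 3.0])
q = np.array([4.0, -1.0, 0.5, 2.0])
L = boost_z(0.6)

invariant_before = minkowski_dot(p, q)
invariant_after = minkowski_dot(L @ p, L @ q)
print(invariant_before, invariant_after)  # equal up to floating-point rounding
```

An equivariant network is built so that transforming all inputs by `L` transforms its internal tensors in a correspondingly well-defined way, while scalar outputs such as the one above stay fixed.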
The code in this repository can be broken down into three categories, corresponding to different tasks: dataset conversion, network training, and plotting kinematics and network results. Below we list the code dependencies (global and task-specific).
Dataset conversion:
- Python3 libraries: h5py, numba, numpy, pandas
- HTCondor (optional, highly recommended for parallelizing conversion tasks)
- ROOT and PyROOT (optional)
Network training:
- Python3 libraries: cudatoolkit, h5py, PyTorch, torchvision (easy setup via conda; see further down)
Plotting:
- Python3 libraries: h5py, matplotlib, numpy
- ROOT and PyROOT
The easiest way is to install on top of a conda environment, via pip. LGN requires Python 3, PyTorch 1.2, CUDA 10, and a few smaller packages. All of these should be installed automatically when you run setup.py.
LGN is installable using pip. You can currently install it from source by going to the directory with setup.py and running:
pip install .
If you would like to modify the source code directly, note that LGN can also be installed in "development mode" using the command:
pip install -e .
To explain how to train the network, we focus on the example of top tagging using the reference dataset from the summary paper "The Machine Learning Landscape of Top Taggers" by G. Kasieczka et al.
First, we need to convert the dataset files into a format that our network's data-loading utilities (and our plotting utilities) understand. This is especially important because we may want to use datasets from different sources, each with its own format and organization. Converting each to a single format avoids the need for multiple PyTorch Dataset classes and duplicate plotting scripts.
In the case of the top-tagging dataset, this conversion is very lightweight: the dataset, like the format our network uses, stores jet constituents as lists of momentum 4-vectors in Cartesian format (E, px, py, pz), where the z-axis corresponds to the beamline. All we need to do is copy the data from a pandas DataFrame (saved in an HDF5 file) to a new HDF5 file written using h5py.
To accomplish this, we will use the script at /data/toptag/conversion/raw2h5/convert.py. This script uses HTCondor to parallelize the conversion and speed things up. It can also be run without HTCondor, in which case the conversion will not be parallelized. The script can be run as follows:
python3 convert.py /path/to/dir/with/data/files njobs
Here, the first argument is the path to the directory containing the unconverted top-tagging files. The second argument (njobs) is the number of jobs to submit to HTCondor. If not using Condor, this should be set to -1.
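The heart of such a conversion can be sketched as follows. This is not the code in convert.py; it is an illustration under stated assumptions: the raw table is taken to store each jet as one flat row `[E_0, px_0, py_0, pz_0, E_1, ...]` with zero padding, and the output dataset names `Pmu`, `Nobj` and `is_signal` are hypothetical. The h5py import is deferred so the array helpers can be tried without it.

```python
import numpy as np


def flat_to_four_vectors(flat, nobj):
    """Reshape rows of [E_0, px_0, py_0, pz_0, E_1, ...] into (n_jets, nobj, 4)."""
    flat = np.asarray(flat, dtype=np.float64)
    n_jets = flat.shape[0]
    return flat[:, : 4 * nobj].reshape(n_jets, nobj, 4)


def count_constituents(pmu):
    """Count constituents per jet, assuming padded entries are all zeros."""
    return np.count_nonzero(np.any(pmu != 0.0, axis=-1), axis=-1)


def write_h5(path, pmu, labels):
    """Write the converted arrays with h5py (dataset names are hypothetical)."""
    import h5py  # imported lazily so the helpers above work without it

    with h5py.File(path, "w") as f:
        f.create_dataset("Pmu", data=pmu, compression="gzip")
        f.create_dataset("Nobj", data=count_constituents(pmu))
        f.create_dataset("is_signal", data=np.asarray(labels))


# Two toy jets with up to two constituents each; the second jet is zero-padded.
flat = [
    [10.0, 1.0, 2.0, 3.0, 5.0, 0.5, 0.5, 1.0],
    [8.0, 2.0, 1.0, 4.0, 0.0, 0.0, 0.0, 0.0],
]
pmu = flat_to_four_vectors(flat, nobj=2)
print(pmu.shape, count_constituents(pmu))
```

With a real file, `flat` would come from the pandas DataFrame (for example via `DataFrame.to_numpy()` on the momentum columns), and `write_h5` would be called once per output file.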
Note: When training the network, loading the data is handled by the script at /src/lgn/data/utils.py, which organizes the input data files into training, testing and validation samples. These are determined by matching the patterns train*.h5, test*.h5, and val*.h5, respectively. There must be at least one file matching each of these patterns (otherwise one of these samples will be missing!). Files in the specified data directory that do not match any of these patterns are ignored.
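Before starting a long training run, it can help to check that the data directory satisfies this naming rule. The sketch below is not the logic in utils.py; it only mirrors the three patterns quoted above using the standard library, and the function name `find_split_files` and the split key `valid` are hypothetical.

```python
import glob
import os

# Patterns quoted in the note above; the dictionary keys are arbitrary labels.
SPLIT_PATTERNS = {"train": "train*.h5", "test": "test*.h5", "valid": "val*.h5"}


def find_split_files(datadir):
    """Map each split name to the sorted list of matching files in datadir."""
    splits = {
        name: sorted(glob.glob(os.path.join(datadir, pattern)))
        for name, pattern in SPLIT_PATTERNS.items()
    }
    # Fail early if any sample would be empty.
    missing = [name for name, files in splits.items() if not files]
    if missing:
        raise FileNotFoundError(
            f"no files found for split(s): {', '.join(missing)}"
        )
    return splits
```

Calling `find_split_files("/path/to/dir/with/converted/data/files")` returns the file lists per sample, or raises if a sample has no files. Files that match none of the patterns are simply not returned.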
With the data files ready to be read into the network, it's time for training! Some of the steps are specific to logging into one of our clusters, Lambda, but the instructions are general to all machines capable of setting up conda and using CUDA. For completeness, here is an outline of the instructions:
- Clone the git repository to the machine where the network will be trained.
  cd [project folder]
  git clone git@github.com:fizisist/LorentzGroupNetwork.git
- Create a conda environment for this project, and install pytorch, torchvision and cudatoolkit:
  conda create -n pt python=3.7 anaconda
  conda activate pt
  conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
  The version of cudatoolkit may depend on the GPUs being used.
- Install LGN.
  cd [project folder]/LorentzGroupNetwork/
  pip install -e .
- Check which GPUs are available, and select one to use for training.
  nvidia-smi
  export CUDA_VISIBLE_DEVICES=[device_id]
  Training is currently not parallelized across GPUs.
- Train!
  python3 scripts/train_lgn.py
This last script accepts a wide range of arguments corresponding to hyperparameters and network configurations. For example, one may consider the following:
python3 scripts/train_lgn.py --datadir=/path/to/dir/with/converted/data/files --maxdim=3 --max-zf=1 --num-channels 2 4 4 2 --num-epoch=10 --batch-size=8 --num-cg-levels=3 --lr-init=0.001 --lr-final=0.00001 --mlp=True --pmu-in=True --nobj=126 --prefix=my_lgn_config --verbose=0
- datadir: Directory containing the converted top-tagging files.
- maxdim: Maximum dimensionality of tensors produced in the network.
- max-zf: Maximum degree of zonal functions used in tensor decompositions.
- num-channels: Number of channels per layer.
- num-epoch: Number of training epochs.
- batch-size: Mini-batch size.
- num-cg-levels: Number of Clebsch-Gordan layers. If this is smaller than num-channels, the extra layers at the end will be standard multi-layer perceptrons (MLPs) acting on any Lorentz invariants produced.
- lr-init: Initial learning rate.
- lr-final: Final learning rate.
- mlp: Whether or not to insert MLPs acting on Lorentz-invariant scalars within the CG layers.
- pmu-in: Whether or not to feed the 4-momenta themselves to the first CG layer, in addition to scalars.
- nobj: Maximum number of jet constituents to use for each entry. Constituents are ordered by decreasing pT, so the network uses the nobj leading constituents.
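The effect of nobj can be illustrated with a short sketch. It is not the repository's data-loading code: the function name `leading_constituents` is hypothetical, and it sorts explicitly by pT (computed from px and py) rather than relying on the input already being ordered, then zero-pads jets with fewer than nobj constituents.

```python
import numpy as np


def leading_constituents(pmu, nobj):
    """Keep the nobj highest-pT constituents of one jet, zero-padding if needed.

    pmu has shape (n_constituents, 4) with columns (E, px, py, pz).
    """
    pmu = np.asarray(pmu, dtype=np.float64)
    pt = np.hypot(pmu[:, 1], pmu[:, 2])      # transverse momentum
    order = np.argsort(-pt, kind="stable")   # indices in decreasing pT
    selected = pmu[order][:nobj]
    padded = np.zeros((nobj, 4))
    padded[: len(selected)] = selected
    return padded


# Toy jet with three constituents whose pT values are 1, 5 and 3.
jet = [
    [10.0, 1.0, 0.0, 2.0],
    [20.0, 3.0, 4.0, 1.0],
    [15.0, 0.0, 3.0, 5.0],
]
top2 = leading_constituents(jet, nobj=2)
print(top2)
```

A smaller nobj reduces memory use and run time at the cost of discarding the softest constituents.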
For a full list of possible arguments for train_lgn.py, see /NetworkDesign/src/lgn/engine/args.py.
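When scanning several configurations, it can be convenient to assemble the command line from a dictionary. The sketch below is a hypothetical helper (`build_train_command` is not part of the repository) that uses only flags shown in the example command above, passing list-valued options such as num-channels as separate tokens.

```python
import sys


def build_train_command(options, script="scripts/train_lgn.py"):
    """Turn a dict of options into an argument list suitable for subprocess.run."""
    command = [sys.executable, script]
    for name, value in options.items():
        flag = f"--{name}"
        if isinstance(value, (list, tuple)):
            # Multi-valued options are passed as separate tokens after the flag.
            command.append(flag)
            command.extend(str(item) for item in value)
        else:
            command.append(f"{flag}={value}")
    return command


options = {
    "datadir": "/path/to/dir/with/converted/data/files",
    "num-channels": [2, 4, 4, 2],
    "num-epoch": 10,
    "batch-size": 8,
    "mlp": True,
}
command = build_train_command(options)
print(" ".join(command[1:]))
# To launch training: subprocess.run(command, check=True)
```

Using `sys.executable` keeps the run inside the active conda environment.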
We can plot network diagnostics:
- accuracy
- area under the ROC curve
- loss
- background rejection at 30% signal efficiency
using the script at /Figures/scripts/perf_plot.py.
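For reference, the ROC-based quantities can be computed from classifier scores as sketched below. This is not the code in perf_plot.py; the function names are hypothetical, the toy labels and scores are made up for illustration, and the ROC construction assumes there are no tied scores.

```python
import numpy as np


def roc_curve(labels, scores):
    """Return (background efficiency, signal efficiency) over all thresholds."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=np.float64)
    order = np.argsort(-scores, kind="stable")  # most signal-like first
    labels = labels[order]
    tpr = np.concatenate([[0.0], np.cumsum(labels) / labels.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(~labels) / (~labels).sum()])
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


def background_rejection(fpr, tpr, signal_efficiency):
    """Inverse background efficiency at a fixed signal efficiency."""
    eps_b = np.interp(signal_efficiency, tpr, fpr)
    return float("inf") if eps_b == 0.0 else 1.0 / eps_b


def accuracy(labels, scores, threshold=0.5):
    """Fraction of jets classified correctly at the given score threshold."""
    predictions = np.asarray(scores) > threshold
    return float(np.mean(predictions == np.asarray(labels, dtype=bool)))


# Toy example: 1 = signal (top), 0 = background.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr = roc_curve(labels, scores)
print(auc(fpr, tpr), accuracy(labels, scores))
```

With real network outputs, `labels` and `scores` would be the truth labels and predicted signal probabilities for the test sample.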
[1] A. Bogatskiy, B. Anderson, J. T. Offermann, M. Roussi, D. W. Miller, R. Kondor, Lorentz Group Equivariant Neural Network for Particle Physics, ICML 2020 (accepted).