MLaaS4HEP is a set of Python-based modules to read HEP ROOT files and stream them to the ML framework of the user's choice for training. It consists of three independent layers:
- data streaming layer to handle local and remote data, see reader.py
- data training layer to train ML models on given HEP data, see workflow.py
- data inference layer to serve model predictions, see tfaas_client.py
The pre-trained models can be easily uploaded to the TFaaS server, which serves them to clients.
The MLaaS4HEP Python repository provides two base modules to read and manipulate HEP ROOT files. The reader.py
module defines a DataReader class which can read either local or remote ROOT files (via xrootd), while the workflow.py
module provides a basic DataGenerator class which can be used with any ML
framework to read HEP ROOT data in chunks. Both modules are based on the
uproot framework.
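For orientation, below is a minimal sketch of the kind of chunked access both modules build on, using uproot directly (uproot 4+ API); the file and branch names are illustrative assumptions taken from the NanoAOD examples in this README:

# Minimal sketch of chunked ROOT reading with uproot (4+ API);
# DataReader/DataGenerator wrap this kind of access.
import uproot

# local path or root:// URL, e.g. the NanoAOD file used below
fname = "/opt/cms/data/Tau_Run2017F-31Mar2018-v1_NANOAOD.root"
with uproot.open(fname) as fin:
    tree = fin["Events"]
    print(f"{tree.num_entries} events, {len(tree.keys())} branches")
    # iterate over the tree in chunks of 1000 events
    for chunk in tree.iterate(["Muon_pt"], step_size=1000, library="np"):
        print(len(chunk["Muon_pt"]), "events in this chunk")
        break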
Basic usage
# get help and option description
./reader.py --help
# here is a concrete example of reading local ROOT file:
./reader.py --fin=/opt/cms/data/Tau_Run2017F-31Mar2018-v1_NANOAOD.root --info --verbose=1 --nevts=2000
# here is an example of reading remote ROOT file:
./reader.py --fin=root://cms-xrd-global.cern.ch//store/data/Run2017F/Tau/NANOAOD/31Mar2018-v1/20000/6C6F7EAE-7880-E811-82C1-008CFA165F28.root --verbose=1 --nevts=2000 --info
# both of the aforementioned commands produce output like the following
First pass: 2000 events, 35.4363200665 sec, shape (2316,) 648 branches: flat 232 jagged
VMEM used: 960.479232 (MB) SWAP used: 0.0 (MB)
Number of events : 1131872
# flat branches : 648
... # followed by a long list of ROOT branches found along with their dimensionality
TrigObj_pt values in [5.03515625, 1999.75] range, dim=21
More examples of using uproot may be found in the uproot documentation and tutorials.
HEP data are stored in the ROOT data format. The DataReader class provides access to ROOT files and various APIs to access the HEP data.
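A hypothetical usage sketch of DataReader is shown below; the constructor arguments mirror the CLI options above (--fin, --nevts) and the chunk-iteration call is an assumption, so consult reader.py for the exact signature:

# Hypothetical DataReader usage; argument and method names are
# assumptions modeled on the reader.py CLI options above.
from reader import DataReader

fname = "/opt/cms/data/Tau_Run2017F-31Mar2018-v1_NANOAOD.root"
reader = DataReader(fname, nevts=2000)   # assumed signature
for chunk in reader.next():              # assumed chunk generator
    print(chunk.shape)
    break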
A simple workflow example can be found in the workflow.py code. It contains two examples (one for PyTorch and another for TF in Keras) and shows the full HEP ML workflow, i.e. it can read remote files and perform the training of ML models with HEP ROOT files. A sketch of a user-supplied model file follows the command examples below.
If you clone the repo and set up your PYTHONPATH, you should be able to run it as simply as
./workflow.py --help
# run the code with list of LFNs from files.txt
./workflow.py --files=files.txt
# run pytorch example
./workflow.py --files=files.txt --model=ex_pytorch.py
# run keras example
./workflow.py --files=files.txt --model=ex_keras.py
# cat files.txt
#dasgoclient -query="file dataset=/Tau/Run2018C-14Sep2018_ver3-v1/NANOAOD"
/store/data/Run2018C/Tau/NANOAOD/14Sep2018_ver3-v1/60000/069A01AD-A9D0-7C4E-8940-FA5990EDFFCE.root
/store/data/Run2018C/Tau/NANOAOD/14Sep2018_ver3-v1/60000/577AF166-478C-1F40-8E10-044AA4BC0576.root
/store/data/Run2018C/Tau/NANOAOD/14Sep2018_ver3-v1/60000/9A661A77-58AC-0245-A442-8093D48A6551.root
/store/data/Run2018C/Tau/NANOAOD/14Sep2018_ver3-v1/60000/C226A004-077B-7E41-AFB3-6AFB38D1A63B.root
/store/data/Run2018C/Tau/NANOAOD/14Sep2018_ver3-v1/60000/D1E05C97-DB14-3941-86E8-C510D602C0B9.root
/store/data/Run2018C/Tau/NANOAOD/14Sep2018_ver3-v1/60000/6FA4CC7C-8982-DE4C-BEED-C90413312B35.root
/store/data/Run2018C/Tau/NANOAOD/14Sep2018_ver3-v1/60000/282E0083-6B41-1F42-B665-973DF8805DE3.root
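The --model option points to a Python file describing the ML model. Here is a sketch of such a file, assuming (based on the bundled ex_keras.py example) that it exposes a model(idim) function where idim is the input dimension; check the example files in the repo for the exact contract:

# Sketch of a user model file passed via --model (e.g. my_keras.py).
# The model(idim) entry point is an assumption based on the bundled
# ex_keras.py example; idim is the dimension of the input vector.
from keras.models import Sequential
from keras.layers import Dense

def model(idim):
    "Build and compile a simple binary classifier for input dimension idim"
    ml_model = Sequential()
    ml_model.add(Dense(128, activation='relu', input_dim=idim))
    ml_model.add(Dense(1, activation='sigmoid'))
    ml_model.compile(optimizer='adam', loss='binary_crossentropy',
                     metrics=['accuracy'])
    return ml_model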
The workflow.py module relies on two JSON files: one which contains parameters for
reading ROOT files and another with the specification of ROOT branches. The latter
is generated by reading the ROOT file itself.
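For illustration, a parameters file might look like the following; the key names here are assumptions modeled on the reader options above (number of events, chunk size, branch selection, xrootd redirector), so consult the repository for the actual schema:

{
    "nevts": 3000,
    "chunk_size": 1000,
    "branch": "Events",
    "selected_branches": "",
    "exclude_branches": "",
    "redirector": "root://cms-xrd-global.cern.ch",
    "verbose": 1
}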
We provide a complete example called hep_resnet.py as a basic model based on a ResNet implementation.
It can classify images from HEP events, e.g.
hep_resnet.py --fdir=/path/hep_images --flabels=labels.csv --epochs=200 --mdir=models
Here we supply the input directory /path/hep_images, which contains HEP images
in a train folder, along with a labels.csv file which provides the labels.
The model runs for 200 epochs and saves the Keras/TF model into the models output directory.
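For illustration, a hypothetical labels.csv could pair image file names with their classes; the actual format expected by hep_resnet.py may differ, so check the code:

# hypothetical labels.csv layout (image file name, class label)
img_0001.png,signal
img_0002.png,background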
We provide a pure Python client to perform all necessary actions against the TFaaS server. Here is a short description of the available APIs:
# setup url to point to your TFaaS server
url=http://localhost:8083
# create upload json file, which should include
# fully qualified model file name
# fully qualified labels file name
# model name you want to assign to your model file
# fully qualified parameters json file name
# For example, here is a sample of upload json file
{
"model": "/path/model_0228.pb",
"labels": "/path/labels.txt",
"name": "model_name",
"params":"/path/params.json"
}
# upload given model to the server
tfaas_client.py --url=$url --upload=upload.json
# list existing models in TFaaS server
tfaas_client.py --url=$url --models
# delete given model in TFaaS server
tfaas_client.py --url=$url --delete=model_name
# prepare input json file for querying model predictions
# here is an example of such file
{"keys":["attribute1", "attribute2"], values: [1.0, -2.0]}
# get predictions from TFaaS server
tfaas_client.py --url=$url --predict=input.json
# get image predictions from TFaaS server
# here we refer to the ImageModel model previously uploaded to TFaaS
tfaas_client.py --url=$url --image=/path/file.png --model=ImageModel
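The same prediction call can also be made without the client, e.g. with the Python requests library; the /json endpoint name below is an assumption based on the tfaas_client.py --predict behaviour, so verify it against the TFaaS documentation:

# Prediction request sketched with plain Python; the /json endpoint
# is an assumption -- verify against the TFaaS documentation.
import requests

url = "http://localhost:8083/json"  # assumed prediction endpoint
payload = {"keys": ["attribute1", "attribute2"], "values": [1.0, -2.0]}
resp = requests.post(url, json=payload)
print(resp.json())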
Please cite the following publication: http://arxiv.org/abs/1811.04492