AmpWrapper is designed to consolidate the tools provided by the AmpTools library into a few easy-to-use Python scripts. Among these are scripts for running multiple fits through a SLURM interface, generating configuration files for partial wave analysis in both mass-dependent and model-independent fits, plotting results from fits, and bootstrapping fit results, along with utilities which wrap AmpTools fit results files. This code was designed to be used on the CMU MEG cluster and some of the job-submission code is formatted for our specific queue structure.
In general, scripts have the naming format amptools-<script name> to allow for them to be easily selected through tab completion.
usage: amptools-activate [-h] [-d DIRECTORY]
optional arguments:
-h, --help show this help message and exit
-d DIRECTORY, --directory DIRECTORY
(optional) path to new environment directory (default
is current directory)
- Activates an AmpWrapper environment in the specified DIRECTORY (uses current working directory if none is supplied). This involves creating folders for flattrees and a .env.json file which is populated with useful information for other scripts. It also creates a file, ~/.amptoolstools, which just points all programs to the currently activated environment (this allows scripts to be run from anywhere without having to be in the environment directory).
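The pointer-file idea can be illustrated with a minimal sketch. It assumes the pointer file holds nothing but the environment path as a single line of text; the real contents of ~/.amptoolstools and .env.json are not described here, and the function names below are hypothetical rather than part of AmpWrapper.

```python
import tempfile
from pathlib import Path


def write_pointer(pointer_file: Path, env_dir: Path) -> None:
    """Record the active environment directory in the pointer file."""
    # Assumed format: a single line holding the absolute environment path.
    pointer_file.write_text(str(env_dir.resolve()) + "\n")


def read_pointer(pointer_file: Path) -> Path:
    """Return the environment directory recorded in the pointer file."""
    return Path(pointer_file.read_text().strip())


# Demo in a throwaway directory so nothing in the real home directory is touched.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "pwa_analysis"
    env_dir.mkdir()
    pointer = Path(tmp) / "pointer_file"  # stand-in for ~/.amptoolstools
    write_pointer(pointer, env_dir)
    round_trip_ok = read_pointer(pointer) == env_dir.resolve()

print(round_trip_ok)
```

Any script can then call something like read_pointer at startup to locate the active environment regardless of the current working directory.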
usage: amptools-convert [-h] [--version] [--merge-only]
[--exclude EXCLUDE [EXCLUDE ...]] [-w WEIGHT]
[-f FORMAT [FORMAT ...]] [-p PREFIX] [--force]
[--no-pol]
input output
Convert ROOT analysis trees to AmpTools flat trees
positional arguments:
input Directory containing either:
1. ROOT files with run numbers in their names
2. ROOT files with any of the following keywords
in their names (this will skip the merging step!)
3. Directories whose names contain the following
keywords and whose contents are ROOT files
Keywords: AMO, PARA_0, PERP_45, PERP_90, PARA_135
In the case of input directories which contain a mix of the
above formats, the program will choose a method in the given order.
output Directory to store any output files
optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
--merge-only Merge files by polarization without conversion
--exclude EXCLUDE [EXCLUDE ...]
Exclude polarizations from merging/conversion
e.g. "--exclude AMO PERP_45 PERP_90" to only
process PARA polarizations
-w WEIGHT, --weight WEIGHT
Specify a weighting factor (positive, even
for background trees)
Default: 1.0
If a path is supplied, weights will be pulled from CSVs with columns of
| EventNumber | ComboNumber | Signal Weight | ... |
This file must have a header line with at least these column names and no
index column. The filename must contain the run number.
-f FORMAT [FORMAT ...]
Specify format for final state (run without
this option for more information)
-p PREFIX, --prefix PREFIX
Specify output prefix
--force Force recreation of new merged ROOT files if they already exist
--no-pol Don't include energy-dependent polarization information in the beam 4-momentum
--no-accidental-subtraction
Skip accidental weighting
- Converts GlueX analysis TTrees into AmpTools flattree format after merging them according to polarization.
- A flat weight can be specified by -w. Alternatively, a directory can be supplied containing multiple CSVs whose file names contain a run number and whose header is EventNumber,ComboNumber,Signal Weight,...
- For example, some_weighting/some_weights_030499.csv:
EventNumber | ComboNumber | Signal Weight |
---|---|---|
493208 | 0 | 0.5432 |
495834 | 1 | 0.9982 |
... | ... | ... |
would be accessed by -w some_weighting.
- The -f FORMAT option can be used when you already know the particles which you want in the final state. Running the script without this argument will give you a dialog to specify the particles and a string of numbers which can be used in future script calls to skip the dialog step.
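To make the weight-file layout concrete, here is a small sketch of how such CSVs could be read. This is not the code amptools-convert uses; the function names are hypothetical, and the assumption that run numbers appear zero-padded to six digits is taken only from the example filename above.

```python
import csv
import io


def pick_weight_file(filenames, run_number):
    """Return the first CSV whose name contains the zero-padded run number."""
    tag = f"{run_number:06d}"  # assumption: six-digit padding, as in 030499
    matches = [name for name in filenames if name.endswith(".csv") and tag in name]
    return matches[0] if matches else None


def load_signal_weights(csv_file):
    """Map (EventNumber, ComboNumber) -> Signal Weight from an open CSV file."""
    weights = {}
    for row in csv.DictReader(csv_file):
        key = (int(row["EventNumber"]), int(row["ComboNumber"]))
        weights[key] = float(row["Signal Weight"])
    return weights


# In-memory stand-in for some_weighting/some_weights_030499.csv
example = io.StringIO(
    "EventNumber,ComboNumber,Signal Weight\n"
    "493208,0,0.5432\n"
    "495834,1,0.9982\n"
)
weights = load_signal_weights(example)
chosen = pick_weight_file(["some_weights_030499.csv", "some_weights_030500.csv"], 30499)
print(chosen, weights[(493208, 0)])
```

Keying on the (EventNumber, ComboNumber) pair keeps the lookup unambiguous when one event has several combos.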
usage: amptools-link [-h] [-d | -b | -g | -a] [-s | -c] [-f]
ROOT file [ROOT file ...]
positional arguments:
ROOT file file(s) to add
optional arguments:
-h, --help show this help message and exit
-d, --data add as data (default)
-b, --bkg add as background
-g, --gen add as generated/thrown MC
-a, --acc add as accepted/reconstructed MC
-s, --soft soft symlink to file (default is hard symlink)
-c, --copy copy file rather than symlink
-f, --force force overwrite of file if it already exists in the environment
- This script adds flattrees to the proper folders in an AmpWrapper environment. By default, these files are hard linked, which means a second directory entry inside the AmpWrapper environment is created for the same data on disk. Deleting the file in that directory does not delete it in the original location and vice-versa, but it only takes up one spot on disk. Alternatively, -s can be used to create a soft symlink which just points to the original location on disk, but this link is broken if the original file is deleted or moved. -c can alternatively be used to copy the file, but this will take up two different locations on disk (although modifying files in one location will not affect the others, unlike with a hard link).
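The difference between the three modes can be checked with a short standard-library sketch. It assumes a POSIX filesystem (hard links and symlinks may need extra privileges elsewhere), and the file names are placeholders rather than real flattrees.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "original.root"  # placeholder file, not a real ROOT file
    original.write_text("flattree")
    hard = Path(tmp) / "hard.root"
    soft = Path(tmp) / "soft.root"
    copy = Path(tmp) / "copy.root"

    os.link(original, hard)       # hard link: second name for the same data
    os.symlink(original, soft)    # soft symlink: pointer to the original path
    shutil.copy(original, copy)   # copy: independent second file

    # A hard link shares the inode (and therefore the disk space) of the original.
    same_inode = os.stat(hard).st_ino == os.stat(original).st_ino

    # Remove the original and see which of the three still works.
    original.unlink()
    hard_survives = hard.exists() and hard.read_text() == "flattree"
    soft_broken = soft.is_symlink() and not soft.exists()
    copy_survives = copy.read_text() == "flattree"

print(same_inode, hard_survives, soft_broken, copy_survives)
```

Only the soft symlink breaks when the original is removed, which is the trade-off described above.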
usage: amptools-study [-h] [-d DATA [DATA ...]] [-g GEN [GEN ...]]
[-a ACC [ACC ...]] [-b BKG [BKG ...]] [--use-background]
[-n NBINS] [--low LOW] [--high HIGH]
name
positional arguments:
name study name
optional arguments:
-h, --help show this help message and exit
-d DATA [DATA ...], --data DATA [DATA ...]
path(s) to data files (optional)
-g GEN [GEN ...], --gen GEN [GEN ...]
path(s) to generated MC files (optional)
-a ACC [ACC ...], --acc ACC [ACC ...]
path(s) to accepted MC files (optional)
-b BKG [BKG ...], --bkg BKG [BKG ...]
path(s) to background files (optional)
--use-background prompt for file selector if -b/--bkg option is left
blank
-n NBINS, --nbins NBINS
number of bins (set to 1 for an unbinned study)
--low LOW lower edge for data selection
--high HIGH upper edge for data selection
- The only required argument is a name for the study. If none of the file paths are provided, a dialog will allow the user to select files which have been amptools-linked into the environment directory.
- Additionally, if no binning information is provided, a command line interface will load weighted data files and display a histogram with binning that can be modified by user input keys. This is helpful if you don't exactly know what binning you want to use and don't want to create a bunch of plots with static histograms.
usage: amptools-generate [-h] [-b] [-o OUTPUT] [--amo] [-n NAME]
[--add-pol-info] [--sym]
amplitudes [amplitudes ...]
positional arguments:
amplitudes list of amplitudes to include, formatted as L/M/R
optional arguments:
-h, --help show this help message and exit
-b, --background use a separate set of background files
-o OUTPUT, --output OUTPUT
name for output configuration file
--amo include amorphous runs
-n NAME, --name NAME reaction name
--pol-files use separate files for each polarization
--add-pol-info polarization information is NOT included in the ROOT
file's beam photon four-vector
--sym symmetrize the two (non-recoil) particles (use this
for identical particles)
- This script is used to generate AmpTools configuration files with proper bookkeeping on all the amplitudes (specifically for Zlm amplitudes). amplitudes is a list of strings of the format L/M/R (L = total spin = {0, 1, 2, ...}, M = magnetic quantum number = {-L, ..., +L}, R = reflectivity = {-1, +1} or {-, +}). Excluding one of these numbers generates all possibilities (except for L, of course), so 1// generates P waves with M = -1, 0, +1 and both positive and negative reflectivity (six waves total).
- Currently, it must at least generate Zlm amplitudes, but other amplitudes can be tacked on to those Zlms using @tags: 2/0/1@my_Dwave_1 2/0/1@my_Dwave_2 would create two positive-reflectivity D waves with M = 0 which are distinct amplitudes in the config file. By itself, this would not make much sense, but it can be extended with L/M/R@<name>@<Amplitude>@args@args@... to additionally create named amplitudes which are not Zlms. 2/1/1@my_f21270@BreitWigner@1.2755@0.1867@2@2@3 would create a positive-reflectivity D wave with M = +1 as well as a Breit-Wigner amplitude with a mass of 1.2755 GeV and a width of 0.1867 GeV (see the specific amplitude for the required arguments).
- If no -o OUTPUT option is included, the file will be printed to the terminal rather than saved in the configs directory inside the AmpWrapper environment.
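The L/M/R expansion rule can be sketched as follows. This is an illustration of the rule as described above, not the parser in amptools-generate; the function name is hypothetical, and anything after an @ is simply ignored here.

```python
from itertools import product


def expand_wave(spec):
    """Expand an 'L/M/R' string into a list of (L, M, R) tuples."""
    base = spec.split("@")[0]  # drop any @tags; they are not handled in this sketch
    l_str, m_str, r_str = base.split("/")
    if l_str == "":
        raise ValueError("L must always be given")
    spin = int(l_str)

    # An empty M field means every projection from -L to +L.
    projections = [int(m_str)] if m_str else list(range(-spin, spin + 1))
    if any(abs(m) > spin for m in projections):
        raise ValueError("M must lie between -L and +L")

    # An empty R field means both reflectivities.
    if r_str == "":
        reflectivities = [+1, -1]
    elif r_str in ("+", "+1", "1"):
        reflectivities = [+1]
    elif r_str in ("-", "-1"):
        reflectivities = [-1]
    else:
        raise ValueError(f"unrecognized reflectivity: {r_str!r}")

    return list(product([spin], projections, reflectivities))


p_waves = expand_wave("1//")
single_d_wave = expand_wave("2/0/1")
print(len(p_waves), single_d_wave)
```

With this rule, 0// 2// expands to two S waves plus ten D waves.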
usage: amptools-generate-from-json [-h] [-n NAME] [-o OUTPUT] [--amo] [--sym]
[--add-pol-info] [-b]
input
positional arguments:
input input JSON file to turn into amptools-generate command
optional arguments:
-h, --help show this help message and exit
-n NAME, --name NAME reaction name
-o OUTPUT, --output OUTPUT
name for output configuration file
--amo include amorphous runs
--sym symmetrize the two (non-recoil) particles (use this
for identical particles)
--pol-files use separate files for each polarization
--add-pol-info polarization information is NOT included in the ROOT
file's beam photon four-vector
-b, --background use a separate set of background files
- This script has mostly the same arguments as amptools-generate but takes a JSON-formatted file as input rather than a list of amplitude strings. An example JSON file is provided in the repository (amplitudes.json).
- Running the script without -o OUTPUT will print a valid command for amptools-generate to the terminal followed by --output, allowing the user to copy-paste the command and add a name to generate the configuration manually.
usage: amptools-fit [-h] [-s STUDY] [-c CONFIG] [-i ITERATIONS] [-a]
[--seed SEED] [--skip-fit]
[-q {red,green,blue}] [--no-mem]
optional arguments:
-h, --help show this help message and exit
-s STUDY name of AmpTools study to fit
-c CONFIG name of AmpTools config to use in fit
-i ITERATIONS, --iterations ITERATIONS
number of fits to do for each bin (randomized
bootstrap replication)
-a, --append append these iterations to any existing fits rather
than rerunning
--seed SEED seed for randomization
--skip-fit skip fitting and just collect available results from
any previous fits
-q {red,green,blue}, --queue {red,green,blue}
SLURM queue for jobs
--no-mem don't set a memory cap in the SLURM script (use for
large bins where the generated MC is a huge file)
- This script actually runs the fit command provided by halld_sim. The study and configuration names are optional and a dialog will allow the user to select them if they aren't provided.
- These scripts all share similar functionality to amptools-fit but slightly modify the randomization process. While amptools-fit starts all amplitudes in a random spot in parameter space, amptools-fit-chain fits the first bin (the lowest mass bin) a specified number of times from random starting locations, selects the fit with the best likelihood, and starts each subsequent bin fit from the minimized value of the previous one. This significantly reduces the number of fits which are done, but it can be unstable if the first bin isn't a great minimum or if the fit ends up on the wrong branch of minima somewhere along the chain.
- amptools-fit-bootstrap must be run after running amptools-fit or amptools-fit-chain, as it takes the best likelihood fit in each bin and then runs a specified number of fits starting at that minimum with a bootstrapped dataset.
- amptools-fit-stability is a different way of selecting the best minimum. Rather than relying on the best likelihood, each individual iteration within each bin is bootstrapped and the best iteration is selected based on the bootstrap-t, which is related to the distance of the fit from the mean of the bootstraps normalized by the variance of the bootstraps. Fits with the smallest distance are selected because they represent a minimum which won't change much if the data is modified or if new data is obtained.
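The stability criterion can be sketched for a single scalar fit quantity (say, one wave's intensity in one bin). This is only my reading of the description above, not the script's actual implementation: the score here divides by the sample standard deviation of the bootstraps, which is an assumption, and the numbers are made up purely for illustration.

```python
import statistics


def bootstrap_t(fit_value, bootstrap_values):
    """Distance of a fit from its bootstrap mean, in units of the bootstrap spread."""
    mean = statistics.fmean(bootstrap_values)
    spread = statistics.stdev(bootstrap_values)  # assumed normalization
    return abs(fit_value - mean) / spread


def select_most_stable(iterations):
    """Return the index of the iteration with the smallest bootstrap-t score.

    iterations is a list of (fit_value, bootstrap_values) pairs, one per
    randomly initialized fit within a bin.
    """
    scores = [bootstrap_t(fit, boots) for fit, boots in iterations]
    return min(range(len(scores)), key=scores.__getitem__)


# Toy inputs (invented): the first fit sits in the middle of its bootstraps,
# the second sits well away from them.
iterations = [
    (10.0, [9.8, 10.1, 10.2, 9.9]),
    (12.0, [9.0, 10.0, 11.0, 10.5]),
]
best_index = select_most_stable(iterations)
print(best_index)
```

A real fit has many parameters, so an actual implementation would need some way of combining per-parameter scores; that choice is left open here.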
usage: amptools-plot [-h] [-s STUDY]
optional arguments:
-h, --help show this help message and exit
-s STUDY, --study STUDY
study name
- All of these scripts generate plots for their respective fitting scripts. A dialog will allow the user to select the study along with the configuration that was fit. The script generates plots for the amplitudes of each wave, phase plots, complex amplitude plots (for some scripts), and violin plots that show the distribution of fits in each bin (for some scripts). The bootstrap version also does a simple bias-correction calculation (WIP).
- amptools-plot-angles creates plots for the angular distributions of particles in each bin for each type of data (accepted MC, generated MC, acceptance-corrected data) and doesn't require a fit to be run first.
- These scripts are used to select a specific thrown topology based on the particles you want in your final state. Generators like gen_amp can create unwanted decays which are difficult to deal with in the amptools-convert script, so it is useful to only select one topology at a time in an AmpTools analysis.
- This script is still a work in progress. The intent is for it to display useful information about a particular study or configuration.
- A simple script to boost flattrees to the center-of-momentum frame. This is largely not required because AmpTools already does this by default, but might be useful for other testing purposes.
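For reference, the boost itself can be sketched in plain Python. This is a generic Lorentz boost on (E, px, py, pz) tuples in natural units, not the code in the script; the function names are hypothetical and the beam/target values are illustrative rather than taken from any dataset.

```python
import math


def boost(p, beta):
    """Boost four-vector p = (E, px, py, pz) into a frame moving with velocity beta."""
    energy, px, py, pz = p
    bx, by, bz = beta
    b2 = bx * bx + by * by + bz * bz
    if b2 == 0.0:
        return p
    gamma = 1.0 / math.sqrt(1.0 - b2)
    bp = bx * px + by * py + bz * pz
    # Standard decomposition into components parallel and perpendicular to beta.
    factor = (gamma - 1.0) * bp / b2 - gamma * energy
    return (
        gamma * (energy - bp),
        px + factor * bx,
        py + factor * by,
        pz + factor * bz,
    )


def to_center_of_momentum(particles):
    """Boost every four-vector into the frame where the total momentum vanishes."""
    total = [sum(component) for component in zip(*particles)]
    beta = tuple(c / total[0] for c in total[1:])  # velocity of the whole system
    return [boost(p, beta) for p in particles]


# Illustrative initial state: an 8.5 GeV photon on a proton at rest.
beam = (8.5, 0.0, 0.0, 8.5)
target = (0.938, 0.0, 0.0, 0.0)
cm_vectors = to_center_of_momentum([beam, target])
total_cm = [sum(component) for component in zip(*cm_vectors)]
print(total_cm)
```

In the boosted frame the summed three-momentum vanishes and the summed energy equals the invariant mass of the system.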
- This file is not a script, but it contains most of the helper functions used by the rest of the scripts.
Here I demonstrate a typical workflow starting from some processed data/MC GlueX analysis trees located at ~/data_trees/, ~/recon_trees/ (reconstructed/accepted MC), and ~/gen_trees/ (thrown/generated MC). I exclude most of the output/all dialogs from the programs:
$ tree ~/data_trees/
/home/<username>/data_trees/
├── tree_ksks__B4_050685.root
├── tree_ksks__B4_050697.root
├── tree_ksks__B4_050698.root
...
└── tree_ksks__B4_051768.root
$ tree ~/gen_trees/
/home/<username>/gen_trees/
├── tree_ksks__B4_PARA_0.root
├── tree_ksks__B4_PARA_135.root
├── tree_ksks__B4_PERP_45.root
└── tree_ksks__B4_PERP_90.root
$ tree ~/recon_trees/
/home/<username>/recon_trees/
├── tree_ksks__B4_PARA_0.root
├── tree_ksks__B4_PARA_135.root
├── tree_ksks__B4_PERP_45.root
└── tree_ksks__B4_PERP_90.root
First, we convert the GlueX analysis trees to AmpTools flattrees:
$ amptools-convert -p data ~/data_trees/ ~/data_merged/ --exclude AMO
$ amptools-convert -p gen ~/gen_trees/ ~/gen_merged/
$ amptools-convert -p acc ~/recon_trees/ ~/recon_merged/
Next, activate a new AmpWrapper environment and link the flattrees:
$ amptools-activate -d ~/pwa_analysis/
$ amptools-link ~/data_merged/flattrees/*
$ amptools-link --gen ~/gen_merged/flattrees/*
$ amptools-link --acc ~/recon_merged/flattrees/*
Next, we will make a configuration file. For this demonstration, we'll just add all possible S and D waves (probably overkill, but we could use the results of such a fit to decide which waves to get rid of). Because this is a channel with two identical kaons in the final state, we should symmetrize them in the fit:
$ amptools-generate --sym -n ksks_fit -o all_waves 0// 2//
Next, we create a study. Our study here will span the 1-2 GeV range, so 25 bins will give us a bin-width of 40 MeV:
$ amptools-study -n 25 --low 1.000 --high 2.000 my_study
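The bin-width arithmetic for this study can be checked with a few lines of Python. The helper name is hypothetical, and equal-width bins are assumed.

```python
def bin_edges(low, high, nbins):
    """Return the nbins + 1 edges of equal-width bins between low and high."""
    width = (high - low) / nbins
    return [low + i * width for i in range(nbins + 1)]


edges = bin_edges(1.000, 2.000, 25)          # GeV, matching the command above
width_mev = (edges[1] - edges[0]) * 1000.0   # convert GeV to MeV
print(len(edges) - 1, round(width_mev, 6))
```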
Now let's fit the data with the configuration we just made, creating 30 randomly initialized fits in each bin, and submit it to the red SLURM queue:
$ amptools-fit -s my_study -c all_waves -i 30 -q red
Next, let's generate a plot:
$ amptools-plot -s my_study
This will create a PDF at ~/pwa_analysis/my_study/plot_all_waves_results.pdf. Suppose we like these results and want to get a better interpretation of the error/bias through bootstrapping. We then do the following:
$ amptools-fit-bootstrap -s my_study -c all_waves -i 50 -q red
$ amptools-plot-bootstrap -s my_study
This will generate a PDF at ~/pwa_analysis/my_study/plot_all_waves_results_bootstrap.pdf.
The following code will install everything in one line:
$ python -m pip install git+https://github.com/denehoffman/AmpToolsTools.git
If you want to have an easily-accessible cloned version of the source, you can also do
$ git clone https://github.com/denehoffman/AmpToolsTools.git
$ cd AmpToolsTools
$ python -m pip install .
or
$ python -m pip install -e .
to install the repo in editable mode (so that you can modify it without having to reinstall each time to test it).
- Possibly a wrapper for more AmpPlotter functionality (right now the plot scripts will only really work for mass/model-independent fits, although you can still create studies with one bin and fit mass-dependent models).
- Clean up code, help strings, and documentation