For this tutorial we will use CMSSW_13_3_2 on lxplus 9. To set this up, run the following commands:
# login to lxplus, we assume bash as shell
# i.e., echo $SHELL should print out '/bin/bash'
cmsrel CMSSW_13_3_2
cd CMSSW_13_3_2/src
cmsenv
# Clone this repository into the src directory:
git clone git@github.com:Dominic-Stafford/CMS_Herwig_tutorial_2024.git
# Build cmssw so that it can find the gen fragments:
scram b -j 4
In this repository we provide a number of generator fragments to run the different examples. A fragment is a part of a configuration containing only the information specific to the physics process. To convert this into a full configuration one needs to use the cmsDriver command, which adds details of output formats and the conditions for the run:
cmsDriver.py CMS_Herwig_tutorial_2024/standalone/python/DYToLL_TuneCH3_13TeV_herwig7_cff.py --python_filename DYToLL_TuneCH3_13TeV_herwig7_cfg.py --eventcontent RAWSIM,NANOAODGEN --datatier GEN,NANOAOD --fileout file:standalone_DY.root --conditions 106X_upgrade2018_realistic_v4 --beamspot Realistic25ns13TeVEarly2018Collision --step GEN,NANOGEN --geometry DB:Extended --era Run2_2018 --no_exec --mc --customise_commands process.MessageLogger.cerr.FwkReport.reportEvery="int(1000)" -n 5000
Copy the resulting full configuration into an appropriate test directory, and run it using the "cmsRun" command:
mkdir -p test/standalone
cd test/standalone
cp ../../DYToLL_TuneCH3_13TeV_herwig7_cfg.py .
cmsRun DYToLL_TuneCH3_13TeV_herwig7_cfg.py
This should produce a large number of files:
- HerwigConfig.in - Herwig input file produced by CMSSW
- standalone_DY.run - non-human readable run card produced by Herwig
- standalone_DY-S123456790.log - Detailed Herwig log (includes full event record for a few events)
- standalone_DY-S123456790.out - Herwig process summary (including xs)
- standalone_DY-S123456790.tex - Herwig credits
- standalone_DY.root - CMS GEN data-tier ROOT file for further processing
- standalone_DY_inNANOAODGEN.root - Events in NanoAOD-like format
- standalone_DY.yoda - Rivet histograms
Try having a look in the .in, .log and .out text files, as well as opening the NanoAODGen file. You can then make the rivet plots using the following command:
rivet-mkhtml --mc-errs standalone_DY.yoda
In the previous example we considered 13 TeV collisions, with the conditions from 2018. To generate events for Run 3 (13.6 TeV), one would first need to edit the line in the fragment that passes the collision energy (in GeV) to Herwig. The line currently reads as follows; replace 13000.0 with 13600.0:
'set EventGenerator:EventHandler:LuminosityFunction:Energy 13000.0'
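If you prefer to script this edit rather than do it by hand, the following is a minimal sketch. The helper name `set_beam_energy` is hypothetical, and the value 13600.0 assumes the 13.6 TeV Run 3 energy implied by the beamspot name used in the cmsDriver command. The helper only operates on the fragment text, so reading and writing the file is left to you.

```python
import re

# Matches the Herwig setting quoted above and captures the numeric energy (GeV).
ENERGY_LINE = re.compile(
    r"(set EventGenerator:EventHandler:LuminosityFunction:Energy\s+)([0-9.]+)"
)


def set_beam_energy(fragment_text, energy_gev):
    """Return fragment_text with the collision energy replaced (hypothetical helper)."""
    new_text, n_changed = ENERGY_LINE.subn(
        lambda m: m.group(1) + f"{energy_gev:.1f}", fragment_text
    )
    # Refuse to guess if the setting is missing or appears more than once.
    if n_changed != 1:
        raise ValueError(f"expected exactly one energy line, found {n_changed}")
    return new_text


# Example on a one-line excerpt of the fragment.
example = "'set EventGenerator:EventHandler:LuminosityFunction:Energy 13000.0',"
print(set_beam_energy(example, 13600.0))
```

Remember to run scram b again after changing the fragment, before re-running cmsDriver.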
Then, one needs to re-run cmsDriver with the appropriate conditions:
cmsDriver.py CMS_Herwig_tutorial_2024/standalone/python/DYToLL_TuneCH3_13TeV_herwig7_cff.py --python_filename DYToLL_TuneCH3_13TeV_herwig7_cfg.py --eventcontent RAWSIM,NANOAODGEN --datatier GEN,NANOAOD --fileout file:standalone_DY.root --conditions 124X_mcRun3_2022_realistic_v12 --beamspot Realistic25ns13p6TeVEarly2022Collision --step GEN,NANOGEN --geometry DB:Extended --era Run3 --no_exec --mc --customise_commands process.MessageLogger.cerr.FwkReport.reportEvery="int(1000)" -n 5000
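Try running the resulting config (in a new test directory). You can then compare the plots at the two collision energies by passing both YODA files to rivet-mkhtml. Each run writes standalone_DY.yoda, so the file names below assume you have renamed or copied the two outputs: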
rivet-mkhtml --mc-errs standalone_DY_Run2.yoda standalone_DY_Run3.yoda
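The Run 2 and Run 3 cmsDriver commands above differ only in the --conditions, --beamspot and --era options. The sketch below makes that explicit by building both argument lists from a small table. The names `RUN_SETTINGS` and `cms_driver_args` are hypothetical, all option values are copied from the commands above, and the script only prints the commands rather than running them.

```python
import shlex

FRAGMENT = "CMS_Herwig_tutorial_2024/standalone/python/DYToLL_TuneCH3_13TeV_herwig7_cff.py"

# The only options that change between the two commands above.
RUN_SETTINGS = {
    "Run2": {
        "conditions": "106X_upgrade2018_realistic_v4",
        "beamspot": "Realistic25ns13TeVEarly2018Collision",
        "era": "Run2_2018",
    },
    "Run3": {
        "conditions": "124X_mcRun3_2022_realistic_v12",
        "beamspot": "Realistic25ns13p6TeVEarly2022Collision",
        "era": "Run3",
    },
}


def cms_driver_args(run):
    """Return the cmsDriver.py argument list for 'Run2' or 'Run3' (hypothetical helper)."""
    settings = RUN_SETTINGS[run]
    return [
        "cmsDriver.py", FRAGMENT,
        "--python_filename", "DYToLL_TuneCH3_13TeV_herwig7_cfg.py",
        "--eventcontent", "RAWSIM,NANOAODGEN",
        "--datatier", "GEN,NANOAOD",
        "--fileout", "file:standalone_DY.root",
        "--conditions", settings["conditions"],
        "--beamspot", settings["beamspot"],
        "--step", "GEN,NANOGEN",
        "--geometry", "DB:Extended",
        "--era", settings["era"],
        "--no_exec", "--mc",
        "--customise_commands",
        "process.MessageLogger.cerr.FwkReport.reportEvery=int(1000)",
        "-n", "5000",
    ]


# Print shell-safe versions of both commands for copy and paste.
for run in RUN_SETTINGS:
    print(" ".join(shlex.quote(arg) for arg in cms_driver_args(run)))
```

Keeping the shared options in one place makes it harder to change, say, the event count for one energy and forget the other.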
To produce the full configs for the external lhe examples from the fragments, one just needs to add "LHE" to the --step argument:
cmsDriver.py CMS_Herwig_tutorial_2024/external_lhe/python/DY_LO_MG_Hw_cff.py --python_filename DY_LO_MG_Hw_cfg.py --eventcontent RAWSIM,NANOAODGEN --datatier GEN,NANOAOD --fileout file:LO_MG_DY.root --conditions 106X_upgrade2018_realistic_v4 --beamspot Realistic25ns13TeVEarly2018Collision --step LHE,GEN,NANOGEN --geometry DB:Extended --era Run2_2018 --no_exec --mc --customise_commands process.MessageLogger.cerr.FwkReport.reportEvery="int(1000)" -n 5000
The configs can then be run in the same way as before:
mkdir -p test/LO_MG
cd test/LO_MG
cp ../../DY_LO_MG_Hw_cfg.py .
cmsRun DY_LO_MG_Hw_cfg.py
Try changing these commands to also run the NLO and 2 additional jets examples.
Matchbox is a Herwig module that lets Herwig use external (NLO) matrix-element (ME) providers and then perform the matching, showering and hadronization all in one go, i.e. there is no need to produce LHE files and pass them to Herwig for the matching/merging/hadronization.
A first NLO DY->ll sample exists for Run 2 with UL conditions; it consists of 120 M events and is available in DAS.
Here we will learn how to find the configuration used for it ourselves. If we wish, we can generate a few events of this sample without modifying anything, using singularity. The prep-id used for the GEN step can be found by going to the parent AOD dataset in DAS and clicking on dbs3show.
The prep-id used for this sample in MCM is PPD-RunIISummer20UL18GEN-00020. Now we will go to MCM and get the test command.
Generating some events interactively takes about 20 minutes. It is better to log in to lxplus again in a new session and do this in your tmp folder. We will let it run in a separate terminal during the tutorial session and look back at it at the end.
cd /tmp/$USER
wget https://cms-pdmv-prod.web.cern.ch/mcm/public/restapi/requests/get_test/PPD-RunIISummer20UL18GEN-00020
source PPD-RunIISummer20UL18GEN-00020
As the next part of our tutorial, we will find the configuration used for the Run 2 ultra-legacy sample. In MCM we can go to the edit details and find the configuration fragment, as well as the cross section of this sample, which is 6048 pb.
The configuration for this sample has been extracted from MCM and placed in CMS_Herwig_tutorial_2024/matchbox/python/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cff.py.
For this part of the tutorial, we will use lxplus7.
ssh lxplus7
export SCRAM_ARCH=slc7_amd64_gcc700
cmsrel CMSSW_10_6_38
cd CMSSW_10_6_38/src
cmsenv
git clone git@github.com:Dominic-Stafford/CMS_Herwig_tutorial_2024.git
If git clone doesn't work for you, just copy the folder:
cp -r /afs/cern.ch/user/t/theofil/public/CMS_Herwig_tutorial_2024/CMS_Herwig_tutorial_2024 .
Build cmssw so that it can find the gen fragments:
scram b -j 4
mkdir -p test/matchbox
cd test/matchbox
cmsDriver.py CMS_Herwig_tutorial_2024/matchbox/python/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cff.py --python_filename DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cfg.py --eventcontent RAWSIM,NANOAODGEN --customise Configuration/DataProcessing/Utils.addMonitoring --datatier GEN,NANOAOD --fileout file:DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7.root --conditions auto:mc --step GEN,NANOGEN --geometry DB:Extended --customise_commands process.MessageLogger.cerr.FwkReport.reportEvery="int(1000)" --no_exec --mc -n 5000
# This will speed up generation for the tutorial
# Do not do it if you make any changes in the configuration
cp /afs/cern.ch/user/t/theofil/public/CMS_Herwig_tutorial_2024/Herwig-cache.CMSSW_10_6_38.lxplus7.tar.bz2 .
tar -xjvf Herwig-cache.CMSSW_10_6_38.lxplus7.tar.bz2
cmsRun DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cfg.py
Alternatively, run cmsRun in the background:
cmsRun DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cfg.py >& output.txt &
disown %1
To switch to the dipole shower, we have to edit the Herwig configuration inside the DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cff.py fragment. For this purpose we will copy this configuration into a new file
cp $CMSSW_BASE/src/CMS_Herwig_tutorial_2024/matchbox/python/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cff.py $CMSSW_BASE/src/CMS_Herwig_tutorial_2024/matchbox/python/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_cff.py
and edit it, for example with nano,
nano $CMSSW_BASE/src/CMS_Herwig_tutorial_2024/matchbox/python/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_cff.py
so that the following lines appear:
'# read Matchbox/MCatNLO-DefaultShower.in',
'read Matchbox/MCatNLO-DipoleShower.in',
'# read Matchbox/FiveFlavourScheme.in',
'read snippets/DipoleShowerFiveFlavours.in',
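The same edit can be scripted. The sketch below is a minimal example under a loud assumption: that the original fragment contains the default-shower and five-flavour-scheme reads as active quoted lines of the form 'read ...', (the source only shows the final state). The names `SWAPS` and `switch_to_dipole` are hypothetical, and the function works on the fragment text only.

```python
# Active read to comment out -> read to add in its place (taken from the lines above).
SWAPS = {
    "read Matchbox/MCatNLO-DefaultShower.in": "read Matchbox/MCatNLO-DipoleShower.in",
    "read Matchbox/FiveFlavourScheme.in": "read snippets/DipoleShowerFiveFlavours.in",
}


def switch_to_dipole(fragment_text):
    """Comment out the default reads and add the dipole ones (hypothetical helper)."""
    out = []
    for line in fragment_text.splitlines():
        for old, new in SWAPS.items():
            # Only match active lines; already commented ones start with "'# ".
            if f"'{old}'" in line:
                indent = line[: len(line) - len(line.lstrip())]
                line = f"{indent}'# {old}',\n{indent}'{new}',"
                break
        out.append(line)
    return "\n".join(out)


# Example on a two-line excerpt written in the assumed original form.
example = (
    "    'read Matchbox/MCatNLO-DefaultShower.in',\n"
    "    'read Matchbox/FiveFlavourScheme.in',"
)
print(switch_to_dipole(example))
```

Running the function twice leaves the text unchanged the second time, because lines that are already commented out no longer match.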
Finally, do not forget to run
scram b
so that CMSSW picks up the new card. You can find an (untested) version of the configuration in MCM under the prep-id PPD-RunIISummer20UL18GEN-00019.
mkdir dipole
cd dipole
cmsDriver.py CMS_Herwig_tutorial_2024/matchbox/python/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_cff.py --python_filename DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_cfg.py --eventcontent RAWSIM,NANOAODGEN --customise Configuration/DataProcessing/Utils.addMonitoring --datatier GEN,NANOAOD --fileout file:DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole.root --conditions auto:mc --step GEN,NANOGEN --geometry DB:Extended --customise_commands process.MessageLogger.cerr.FwkReport.reportEvery="int(1000)" --no_exec --mc -n 5000
# This will speed up generation for the tutorial
# Do not do it if you make any changes in the configuration
cp /afs/cern.ch/user/t/theofil/public/CMS_Herwig_tutorial_2024/Herwig-cache_dipole.CMSSW_10_6_38.lxplus7.tar.bz2 .
tar -xjvf Herwig-cache_dipole.CMSSW_10_6_38.lxplus7.tar.bz2
cmsRun DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_cfg.py >& output.txt &
Similarly, more cards can be tested by modifying the commands passed to Herwig through the CMSSW-Herwig interface. Contact us if you want help with any particular process!
At the end of the exercise you should have four ROOT files, two of them in NanoAOD format:
DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_inNANOAODGEN.root
DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_inNANOAODGEN.root
Comparing their output should be simple with any ROOT or Python analyzer. The following simple ROOT script runs over premade ROOT files from the exercise described above:
ssh -X username@lxplus9.cern.ch # or ssh -Y if you are on a Mac
root
Copy and paste the following lines in your ROOT console:
TFile *fp1 = new TFile("/afs/cern.ch/user/t/theofil/public/CMS_Herwig_tutorial_2024_files/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_inNANOAODGEN.root");
TFile *fp2 = new TFile("/afs/cern.ch/user/t/theofil/public/CMS_Herwig_tutorial_2024_files/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_inNANOAODGEN.root");
TTree *t1 = (TTree*) fp1->Get("Events");
TTree *t2 = (TTree*) fp2->Get("Events");
t2->SetLineColor(kRed);
t2->SetLineStyle(2);
t1->Draw("GenDressedLepton_pt>>h1(35,0,70)");
t2->Draw("GenDressedLepton_pt>>h2(35,0,70)","","same");
The script should produce this output:
Try generating more statistics and checking more variables, such as the dilepton mass, pT and rapidity (Y).
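As a starting point for the dilepton variables, the sketch below computes the invariant mass, pT and rapidity of a lepton pair from each lepton's (pt, eta, phi, mass). It assumes these values come from the GenDressedLepton_pt, GenDressedLepton_eta, GenDressedLepton_phi and GenDressedLepton_mass branches (only the pt branch appears in the script above, so the other names are an assumption). The function names are hypothetical, and reading the branches from the file is left to your ROOT or Python analyzer of choice.

```python
import math


def four_vector(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def dilepton_kinematics(lep1, lep2):
    """Return (mass, pt, rapidity) of the pair; each lepton is (pt, eta, phi, mass)."""
    e, px, py, pz = (a + b for a, b in zip(four_vector(*lep1), four_vector(*lep2)))
    # Clamp at zero to protect against tiny negative values from rounding.
    mass = math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
    pt = math.hypot(px, py)
    rapidity = 0.5 * math.log((e + pz) / (e - pz))
    return mass, pt, rapidity


# Example: two back-to-back massless leptons in the transverse plane.
mass, pt, rapidity = dilepton_kinematics((45.0, 0.0, 0.0, 0.0), (45.0, 0.0, math.pi, 0.0))
print(mass, pt, rapidity)
```

In an event loop you would first pick the two leptons to pair (for example the two with the highest pt) and then fill one histogram per returned quantity for each of the two samples.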