CMS_Herwig_tutorial_2024

Setup

For this tutorial we will use CMSSW_13_3_2 on lxplus 9. To set this up, run the following commands:

# login to lxplus, we assume bash as shell 
# i.e., echo $SHELL should print out '/bin/bash'

cmsrel CMSSW_13_3_2
cd  CMSSW_13_3_2/src
cmsenv
# Clone this repository into the src directory:
git clone git@github.com:Dominic-Stafford/CMS_Herwig_tutorial_2024.git
# Build cmssw so that it can find the gen fragments:
scram b -j 4

Standalone LO generation

In this repository we provide a number of generator fragments to run the different examples. A fragment is a part of a configuration containing only the information specific to the physics process. To convert this into a full configuration one needs to use the cmsDriver command, which adds details of output formats and the conditions for the run:

cmsDriver.py CMS_Herwig_tutorial_2024/standalone/python/DYToLL_TuneCH3_13TeV_herwig7_cff.py --python_filename DYToLL_TuneCH3_13TeV_herwig7_cfg.py --eventcontent RAWSIM,NANOAODGEN --datatier GEN,NANOAOD --fileout file:standalone_DY.root --conditions 106X_upgrade2018_realistic_v4 --beamspot Realistic25ns13TeVEarly2018Collision --step GEN,NANOGEN --geometry DB:Extended --era Run2_2018 --no_exec --mc --customise_commands process.MessageLogger.cerr.FwkReport.reportEvery="int(1000)" -n 5000

Move the completed fragment into an appropriate test directory, and run it using the "cmsRun" command:

mkdir -p test/standalone
cd test/standalone
cp ../../DYToLL_TuneCH3_13TeV_herwig7_cfg.py .
cmsRun DYToLL_TuneCH3_13TeV_herwig7_cfg.py

This should produce a large number of files:

  • HerwigConfig.in - Herwig input file produced by CMSSW
  • standalone_DY.run - non-human readable run card produced by Herwig
  • standalone_DY-S123456790.log - Detailed Herwig log (includes full event record for a few events)
  • standalone_DY-S123456790.out - Herwig process summary (including xs)
  • standalone_DY-S123456790.tex - Herwig credits
  • standalone_DY.root - CMS GEN data-tier ROOT file for further processing
  • standalone_DY_inNANOAODGEN.root - Events in NanoAOD-like format
  • standalone_DY.yoda - Rivet histograms

Try having a look in the .in, .log and .out text files, as well as opening the NanoAODGen file. You can then make the rivet plots using the following command:

rivet-mkhtml --mc-errs standalone_DY.yoda

Extension: changing the collision energy

In the previous example we considered 13 TeV collisions, with the conditions from 2018. To generate events for Run 3, one would first need to change the line in the fragment specifying the collision energy to Herwig:

'set EventGenerator:EventHandler:LuminosityFunction:Energy 13000.0'

Then, one needs to re-run cmsDriver with the appropriate conditions:

cmsDriver.py CMS_Herwig_tutorial_2024/standalone/python/DYToLL_TuneCH3_13TeV_herwig7_cff.py --python_filename DYToLL_TuneCH3_13TeV_herwig7_cfg.py --eventcontent RAWSIM,NANOAODGEN --datatier GEN,NANOAOD --fileout file:standalone_DY.root --conditions 124X_mcRun3_2022_realistic_v12 --beamspot Realistic25ns13p6TeVEarly2022Collision --step GEN,NANOGEN --geometry DB:Extended --era Run3 --no_exec --mc --customise_commands process.MessageLogger.cerr.FwkReport.reportEvery="int(1000)" -n 5000

Try running the config thus produced (in a new test directory). You can then compare the plots at the two different collision energies by passing them both to rivet-mkhtml:

rivet-mkhtml --mc-errs standalone_DY_Run2.yoda standalone_DY_Run3.yoda

Showering external lhe files

To produce the full configs for the external lhe examples from the fragments, one just needs to add "LHE" to the --step argument:

cmsDriver.py CMS_Herwig_tutorial_2024/external_lhe/python/DY_LO_MG_Hw_cff.py --python_filename DY_LO_MG_Hw_cfg.py --eventcontent RAWSIM,NANOAODGEN --datatier GEN,NANOAOD --fileout file:LO_MG_DY.root --conditions 106X_upgrade2018_realistic_v4 --beamspot Realistic25ns13TeVEarly2018Collision --step LHE,GEN,NANOGEN --geometry DB:Extended --era Run2_2018 --no_exec --mc --customise_commands process.MessageLogger.cerr.FwkReport.reportEvery="int(1000)" -n 5000

The configs can then be run in the same way as before:

mkdir -p test/LO_MG
cd test/LO_MG
cp ../../DY_LO_MG_Hw_cfg.py .
cmsRun DY_LO_MG_Hw_cfg.py

Try changing these commands to also run the NLO and 2 additional jets examples.

Herwig in matchbox mode

Matchbox is a mode of Herwig that allows to use external (NLO) ME providers to do the showering and the hadronization, all in one go within Herwig i.e., no need to produce LHE files and pass them for the matching/merging/hadronization to Herwig.

A first sample with NLO DY->ll exists for Run 2 with UL conditions, consisting of 120 M events and is available in DAS.

Here we will learn how to find ourselves configuration used for it. We can generate, if we wish, few events of this sample without modifying anything using singularity. We can find the prep-id used for GEN step by going to the parent AOD in DAS and clicking on the dbs3show.

DAS

The prep-id used for this sample in MCM is the PPD-RunIISummer20UL18GEN-00020. Now we will go to MCM and get the test command.

MCM

Generating some events interactively will need ~20 minutes of time. It is better that we login again to lxplus into a new session and do this in our tmp folder. We will let it run in a separate terminal during the tutorial session and look back at it in the end.

cd /tmp/$USER

wget https://cms-pdmv-prod.web.cern.ch/mcm/public/restapi/requests/get_test/PPD-RunIISummer20UL18GEN-00020

source PPD-RunIISummer20UL18GEN-00020

Producing again some of the Run-2 UltraLegacy sample interactively

As next part of our tutorial, we will find the configuration used for the Run 2 ultralegacy sample sample. We can go to the edit details and find out the configuration fragment, as well as the cross section of this sample, which is 6048 pb.

MCM edit details

The configuration for this sample has been extracted from MCM and placed in CMS_Herwig_tutorial_2024/matchbox/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cff.py.

For this part of the tutorial, we will use lxplus7.

ssh lxplus7
export SCRAM_ARCH=slc7_amd64_gcc700
cmsrel CMSSW_10_6_38
cd CMSSW_10_6_38/src
cmsenv

git clone git@github.com:Dominic-Stafford/CMS_Herwig_tutorial_2024.git

if git clone doesn't work for you, just cp the folder

cp -r /afs/cern.ch/user/t/theofil/public/CMS_Herwig_tutorial_2024/CMS_Herwig_tutorial_2024 .

Build cmssw so that it can find the gen fragments:

scram b -j 4

mkdir -p test/matchbox
cd test/matchbox

cmsDriver.py CMS_Herwig_tutorial_2024/matchbox/python/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cff.py --python_filename DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cfg.py --eventcontent RAWSIM,NANOAODGEN --customise Configuration/DataProcessing/Utils.addMonitoring --datatier GEN,NANOAOD --fileout file:DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7.root --conditions auto:mc --step GEN,NANOGEN --geometry DB:Extended --customise_commands process.MessageLogger.cerr.FwkReport.reportEvery="int(1000)" --no_exec --mc -n 5000 

# This will speed-up generation for the tutorial 
# Do not do it, if you make any changes in the configuation

cp /afs/cern.ch/user/t/theofil/public/CMS_Herwig_tutorial_2024/Herwig-cache.CMSSW_10_6_38.lxplus7.tar.bz2 .
tar -xjvf Herwig-cache.CMSSW_10_6_38.lxplus7.tar.bz2


cmsRun DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cfg.py 

or alternatively run the cmsRun in the background

cmsRun DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cfg.py  >& output.txt &
disown %1

Exercise: modify configuration to use dipole shower

We have to edit the Herwig configuration inside the DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cff.py fragment. For this purpose will will clone this configuration into a new file

cp $CMSSW_BASE/src/CMS_Herwig_tutorial_2024/matchbox/python/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_cff.py $CMSSW_BASE/src/CMS_Herwig_tutorial_2024/matchbox/python/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_cff.py

and edit it inside

nano $CMSSW_BASE/src/CMS_Herwig_tutorial_2024/matchbox/python/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_cff.py

so that the following lines appear:

   '# read Matchbox/MCatNLO-DefaultShower.in',
   'read Matchbox/MCatNLO-DipoleShower.in',   

   '# read Matchbox/FiveFlavourScheme.in',
   'read snippets/DipoleShowerFiveFlavours.in',   

In the end, do not forget to run

scram b

so that CMSSW is informed for the new card. You can find an (untested) version of the configuration in MCM PPD-RunIISummer20UL18GEN-00019

mkdir dipole
cd dipole

cmsDriver.py CMS_Herwig_tutorial_2024/matchbox/python/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_cff.py --python_filename DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_cfg.py --eventcontent RAWSIM,NANOAODGEN --customise Configuration/DataProcessing/Utils.addMonitoring --datatier GEN,NANOAOD --fileout file:DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole.root --conditions auto:mc --step GEN,NANOGEN --geometry DB:Extended --customise_commands process.MessageLogger.cerr.FwkReport.reportEvery="int(1000)" --no_exec --mc -n 5000 

# This will speed-up generation for the tutorial 
# Do not do it, if you make any changes in the configuation

cp /afs/cern.ch/user/t/theofil/public/CMS_Herwig_tutorial_2024/Herwig-cache_dipole.CMSSW_10_6_38.lxplus7.tar.bz2 .
tar -xjvf Herwig-cache_dipole.CMSSW_10_6_38.lxplus7.tar.bz2


cmsRun DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_cfg.py  >& output.txt &

Similarly, more cards could be tested modifying accordingly the commands passed inside Herwig through the CMSSW-Herwig-interface. Contact us if you want to get help on any particular process!

Exercise: Solutions

In the end of the exercise you should have four ROOT files two in NANAOD format.

DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_inNANOAODGEN.root
DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_inNANOAODGEN.root

Comparing their output should be simple with any ROOT or python analyzer. A simple ROOT script which runs over premade ROOT files from the exercise discussed is the following:

ssh -X username@lxplus9.cern.ch # or ssh -Y if you are from mac
root

Copy and paste the following lines in your ROOT console:

TFile *fp1 = new TFile("/afs/cern.ch/user/t/theofil/public/CMS_Herwig_tutorial_2024_files/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_inNANOAODGEN.root");
TFile *fp2 = new TFile("/afs/cern.ch/user/t/theofil/public/CMS_Herwig_tutorial_2024_files/DYToLL_NLO_5FS_TuneCH3_13TeV_matchbox_herwig7_dipole_inNANOAODGEN.root");

TTree *t1 = (TTree*) fp1->Get("Events");
TTree *t2 = (TTree*) fp2->Get("Events");


t2->SetLineColor(kRed)
t2->SetLineStyle(2)

t1->Draw("GenDressedLepton_pt>>h1(35,0,70)")
t2->Draw("GenDressedLepton_pt>>h2(25,0,70)","","same")

The script should produce this output:

lepton_pt

Try generating more statistics and checking more variables like dilepton mass, Pt, Y.