These are the exercises for the ROOT tutorial for the HASCO 2023 summer school.
We will use the Jupyter notebook to run all the exercises. To get into the right environment, please proceed as follows:
- Open a file manager and go into the `ROOT_tutorial` directory
- Create a subdirectory with a unique name (like `firstname_lastname`)
- Close the file manager and open a terminal
- Type `jupyter-notebook` and hit enter
- Ignore the chromium warnings
- You should now have the Jupyter notebook opened in the browser
- In the notebook, navigate into the subdirectory of `ROOT_tutorial` you created for yourself in the first step
- You can now create new notebooks with the `Python 3 (ipykernel)` kernel and do the exercises!
All of the exercises will be written in Python with PyROOT. If you are stuck, don't look at the solution immediately! First, try to get help from:
- The lecture slides
- The internet (Google usually gives you the right links to the ROOT documentation and ROOT forum)
- Your colleagues
Create a notebook which builds and draws a histogram with the following features:
- The number of bins is 50 and the x axis ranges from 0 to 10.
- It is filled with random numbers distributed according to an exponential distribution with rate 0.5. Suggestion: see the TRandom class for generating random numbers, or TH1::FillRandom.
- Its line width is thicker than the default one.
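Before filling a ROOT histogram, it can help to cross-check what an exponential distribution with rate 0.5 should look like. The following is a ROOT-free sketch in plain Python, not the exercise solution; the helper name `fill_exponential`, the seed, and the number of entries are illustrative assumptions. Keep in mind that `TRandom::Exp(tau)` takes the mean `tau = 1/rate` as its argument, so a rate of 0.5 corresponds to `tau = 2`.

```python
import random

N_BINS, X_MIN, X_MAX = 50, 0.0, 10.0   # binning requested in the exercise
RATE = 0.5                             # exponential rate; mean = 1 / RATE = 2


def fill_exponential(n_entries, seed=42):
    """Return (bin counts, sample mean) for exponential random numbers."""
    rng = random.Random(seed)            # fixed seed (assumption) for reproducibility
    counts = [0] * N_BINS
    bin_width = (X_MAX - X_MIN) / N_BINS
    total = 0.0
    for _ in range(n_entries):
        x = rng.expovariate(RATE)
        total += x
        if X_MIN <= x < X_MAX:           # values outside the axis range are not binned
            index = min(int((x - X_MIN) / bin_width), N_BINS - 1)
            counts[index] += 1
    return counts, total / n_entries


counts, mean = fill_exponential(100_000)
print(f"sample mean = {mean:.3f} (expected about {1 / RATE})")
print("first bins:", counts[:5])
```

The bin contents should fall off steadily from left to right, and the sample mean should come out close to 2.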
Create a notebook which builds and draws a graph with the following features:
- The title of the plot is My graph.
- The x and y axes have the labels `my_{X}` and `my_{Y}`, respectively.
- It has three points with the following coordinates: (1,0), (2,3), (3,4).
- The marker is a full square. Its color is red.
- An orange line joins the points.
Create a notebook that follows these steps:
- Create a function with formula cos(x) and draw it.
- Create another cos(x), but scale the argument of the cosine by adding a parameter.
- Set a value for the parameter.
- Change the line color of the second function.
- Draw the second function in the same canvas as the first one.
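As a plain-Python illustration of the second step: scaling the argument with a parameter p0 turns cos(x) into cos(p0 * x), whose period is 2π/p0. In a `TF1` formula string, parameters are written in square brackets, for example `cos([0]*x)`. The function name `scaled_cos` and the parameter value below are assumptions chosen for illustration only.

```python
import math


def scaled_cos(x, p0):
    """Evaluate cos(p0 * x); p0 scales the argument of the cosine."""
    return math.cos(p0 * x)


p0 = 2.0  # illustrative parameter value (assumption)
for x in (0.0, math.pi / 2, math.pi):
    print(f"x = {x:.3f}: cos(x) = {math.cos(x):+.3f}, "
          f"cos({p0}*x) = {scaled_cos(x, p0):+.3f}")
```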
In this exercise, you will get familiar with RooFit, the library that is usually used for likelihood fits in physics analyses.
- Download the RooFit tutorial from this repository, which can also be done by cloning this repository with `git`.
- Run the notebook while trying to broadly understand what is happening in each step.
- The exercises are written at the bottom of the notebook
In this advanced exercise, you will write a little Z boson analysis based on the DoubleElectron CMS Open Data from Run2012B and Run2012C. Behind the links you can also find a description of all the columns in the dataset.
Please try to implement the following steps to analyze the invariant mass spectrum of electron pairs in the dataset:
- Select events with exactly two electrons
- Define a new `Electron_p4` column of type ROOT::Math::PtEtaPhiMVector to represent the electrons
- Calculate the invariant mass of the electron pairs
- Fill a histogram with the pair masses
- Fit the invariant mass histogram to model the main resonance peak, with a chi-square fit
  - For the fit model, you can use the sum of a Gaussian and an exponential (for signal and background), selecting an appropriate range around the peak for the fit
- Produce a nice plot with the data and the fit result
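The invariant mass step can be cross-checked by hand. The sketch below is plain Python without ROOT and writes out what summing two (pt, eta, phi, mass) four-vectors and taking the mass of the sum amounts to; in ROOT, the `PtEtaPhiMVector` class does this for you via vector addition and its `M()` method. The helper names `to_cartesian` and `invariant_mass` and the electron kinematics are made up for illustration.

```python
import math


def to_cartesian(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to (px, py, pz, E) in natural units."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return px, py, pz, energy


def invariant_mass(particles):
    """Invariant mass of the summed four-momenta of the given particles."""
    components = zip(*(to_cartesian(*p) for p in particles))
    px, py, pz, energy = (sum(c) for c in components)
    m2 = energy * energy - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))   # guard against tiny negative rounding errors


# Two back-to-back electrons with pt = 45.6 GeV each (made-up values)
ELECTRON_MASS = 0.000511  # GeV
pair = [(45.6, 0.0, 0.0, ELECTRON_MASS), (45.6, 0.0, math.pi, ELECTRON_MASS)]
print(f"m_ee = {invariant_mass(pair):.2f} GeV")
```

For this toy pair the momenta cancel, so the invariant mass is just the sum of the two energies, about 91.2 GeV.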
Can you model the peak accurately to identify the mass of the Z boson?
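To make the suggested fit model concrete, here is a minimal plain-Python sketch of a Gaussian plus exponential and of the chi-square that such a fit minimises. It is not the ROOT fitting code: the parameter names and toy values are hypothetical, the parameterisation is chosen to resemble ROOT's predefined `gaus` and `expo` formulas, and the per-bin uncertainty is assumed to be sqrt(n) with empty bins skipped.

```python
import math


def model(x, amp, mean, sigma, bkg_const, bkg_slope):
    """Gaussian signal plus exponential background, evaluated at x."""
    signal = amp * math.exp(-0.5 * ((x - mean) / sigma) ** 2)
    background = math.exp(bkg_const + bkg_slope * x)
    return signal + background


def chi_square(bin_centers, counts, params):
    """Sum of squared residuals, each divided by the bin variance n."""
    chi2 = 0.0
    for x, n in zip(bin_centers, counts):
        if n <= 0:            # skip empty bins, whose sqrt(n) uncertainty is zero
            continue
        chi2 += (n - model(x, *params)) ** 2 / n
    return chi2


# Toy check with made-up parameters: counts are taken from the model itself
params = (1000.0, 91.0, 3.0, 5.0, -0.02)
centers = [80.5 + i for i in range(21)]          # 1 GeV bins from 80 to 101
counts = [model(x, *params) for x in centers]
print(chi_square(centers, counts, params))       # exact match gives zero

shifted = (1000.0, 92.0, 3.0, 5.0, -0.02)        # same model with a wrong mean
print(chi_square(centers, counts, shifted) > 0)  # a wrong mean is penalised
```

A fitter varies the five parameters until this sum is as small as possible; the fitted `mean` is then the estimate of the peak position.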
Hint:
- The tutorials are an excellent reference to see RDataFrame examples
- To see the full list of RDataFrame commands, take a look at the documentation
- This [Higgs Boson Analysis](http://opendata.cern.ch/record/12360) is a nice example of a CMS Open Data analysis with RDataFrame that you can get inspired by
- You can also use wildcards to open the ROOT files for both runs in one go:
rdf = ROOT.RDataFrame("Events", "~/CMS_Open_Data/*_DoubleElectron.root")
- For prototyping, it makes sense to restrict the analysis to a small range of events to get to the result faster:
rdf = rdf.Range(0, 100000)
Once you have verified that your analysis works in principle, you can run it with the full number of events.
After doing the last exercise, the plot you have at the end of your Z boson analysis might look like this:
It is already great to achieve such results with a few lines of code, starting from real CMS open data! But there are still many things that can be improved. Try to improve some of these aspects:
- The fit function. Is the peak really just a Gaussian? What are the underlying physics processes? Maybe thinking about this could help to find a better fitting function.
- The object selection. Many of the selected electrons are not actually from the Z boson: they are "fake" electrons in the detector from background noise, resulting in the exponential background in the mass spectrum. Can you improve the electron selection in the analysis to suppress this background?
- The event selection (warning, very difficult!). In the initial analysis, we selected only events with exactly two reconstructed electrons. However, you lose many good events that way, as true Z boson events can also contain more reconstructed electrons (mostly fake electrons)! Can you improve the event selection to recover such cases?
- The fit diagnostics. Can you add a bottom panel to the plot that shows the difference between the fit model and data in each bin, divided by the statistical uncertainty of the data?
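The quantity for the bottom panel in the last point is often called the pull. The following plain-Python sketch only computes it per bin and leaves the plotting to you; it assumes the sign convention data minus model and a Poisson uncertainty of sqrt(n) on the data, and the function name `pulls` and the toy numbers are hypothetical.

```python
import math


def pulls(data_counts, model_values):
    """Per-bin (data - model) / sqrt(data); None for empty bins."""
    result = []
    for n, m in zip(data_counts, model_values):
        # An empty bin has zero sqrt(n) uncertainty, so no pull is defined
        result.append((n - m) / math.sqrt(n) if n > 0 else None)
    return result


# Toy bins: one above the model, one below, one empty
print(pulls([100, 81, 0], [90.0, 90.0, 1.0]))  # [1.0, -1.0, None]
```

If the model describes the data well, the pulls should scatter around zero with a spread of roughly one.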
For this last exercise, there is no solution available, as this is a more difficult bonus exercise.