Event Processing Rate
mcitron opened this issue · 2 comments
Hi,
While running CMG (CMGTools-from-CMSSW_7_2_3) to make ntuples, we noticed that the maximum speeds we're getting are around 30-50 events/second. During Run 1 the ntupling stage usually ran at O(1000) events/second. Do you know what could be causing such a slowdown and how it could be optimised? We've already exported the complex variable (alphaT) calculation to a C module.
Best,
Matthew
(copy-paste of my answer on the egroup)
Hi Matthew,
Speed may depend on many things:
1. what filters you are applying, and how early you apply them in the sequence
2. what data you are reading, especially if you're accessing the full collections of packed PF candidates or Gen candidates
3. how much more data you're writing out
Especially (1) can make a huge difference, e.g. if you were filtering on trigger bits in 5.3.X but not in 7.2.X, given that the trigger filter is at the very beginning of the path.
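To illustrate the point about filter ordering, here is a minimal, self-contained sketch (not actual heppy code; the module names and the event layout are invented for the example) showing why running a cheap trigger filter before an expensive analyzer saves so much time:

```python
import time

# Hypothetical stand-ins for analyzer modules; not the real heppy classes.
def trigger_filter(event):
    """Cheap filter: reject events that fail a (made-up) trigger bit."""
    return event["trigger_bit"]

def expensive_analyzer(event):
    """Stand-in for a costly step, e.g. looping over all PF candidates."""
    time.sleep(0.001)  # simulate 1 ms of work per event
    return True

def run_sequence(events, sequence):
    """Run modules in order; stop an event at the first module returning False."""
    start = time.perf_counter()
    for event in events:
        for module in sequence:
            if not module(event):
                break
    return time.perf_counter() - start

# 1000 toy events, of which only 10% pass the trigger.
events = [{"trigger_bit": i % 10 == 0} for i in range(1000)]
early = run_sequence(events, [trigger_filter, expensive_analyzer])
late = run_sequence(events, [expensive_analyzer, trigger_filter])
print(f"filter first: {early:.2f}s, filter last: {late:.2f}s")
```

With the filter first, the expensive step runs on only the ~10% of events that pass, so the sequence finishes roughly an order of magnitude faster.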
Can you try to make an apples-to-apples test, e.g. using MC samples with the same kinematics (if you use the same HT bin of QCD MC, events should be similar even when pitting 8 TeV against 13 TeV), and with the same filtering?
In recent enough versions of the 7.2.X branch you can get a per-module timing report by passing the --timereport option to heppy.
Some things are slower by design in 7.2.X since they have to compute on the fly things that were precomputed at the cmgtuple stage in 5.3.X (e.g. MVA electron id, quark-gluon likelihood, ...), and some new things in 7.X are slow (e.g. MC matching for Photons or IVFs), but much of this can be turned off at cfg level if you don't want it (and much is off by default).
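Conceptually, a per-module time report like the one --timereport produces just accumulates wall-clock time per module across events. A rough sketch of the idea (the real heppy output format and internals will differ):

```python
import time
from collections import defaultdict

# Accumulated wall-clock time per (made-up) module name.
module_time = defaultdict(float)

def timed(name, func, event):
    """Run one module on one event, adding its wall time to the report."""
    t0 = time.perf_counter()
    result = func(event)
    module_time[name] += time.perf_counter() - t0
    return result

# Toy sequence; the module names and bodies are placeholders.
modules = [("jets", lambda e: True), ("leptons", lambda e: True)]
for event in range(100):
    for name, func in modules:
        if not timed(name, func, event):
            break

# Print the slowest modules first, as a timing report would.
for name, total in sorted(module_time.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {total * 1000:8.3f} ms total")
```

Sorting the report by total time makes it easy to spot which module dominates the event loop.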
Hi all,
I would like to remind everyone that we have noticed a serious drop in processing speed, even for relatively simple tasks.
We also saw that processing can be extremely fast at times: the same process sometimes takes about 20 seconds to complete, and sometimes only a few seconds … Even within a given run, you can sometimes see the processing speed up for ~1k events.
This phenomenon does not seem to be related to fluctuating performance of afs,
and I’m pretty sure the problem is not related to heppy itself.
We still need to:
- compare processing speed between 7X and 53X for C++ FWLite
- do the same comparison for python FWLite (without heppy)
- understand why the processing is sometimes much faster
These tests should be done with a reasonable physics processing task, the same in python and C++,
and should be performed on a statistical basis: time at least 20 runs of the same task and histogram the times.
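The repeated-timing study described above can be sketched with the standard library alone; the workload here is a placeholder standing in for the real FWLite job:

```python
import time
import statistics

def task():
    # Placeholder workload; substitute the real processing task here.
    sum(i * i for i in range(100000))

# Time 20 independent runs of the same task.
timings = []
for _ in range(20):
    t0 = time.perf_counter()
    task()
    timings.append(time.perf_counter() - t0)

mean = statistics.mean(timings)
stdev = statistics.stdev(timings)
print(f"mean {mean * 1000:.1f} ms, stdev {stdev * 1000:.1f} ms")

# Crude text histogram of the 20 timings, 5 equal-width bins.
lo, hi = min(timings), max(timings)
nbins = 5
width = (hi - lo) / nbins or 1.0  # guard against all-equal timings
counts = [0] * nbins
for t in timings:
    counts[min(int((t - lo) / width), nbins - 1)] += 1
for i, c in enumerate(counts):
    print(f"[{lo + i * width:.4f}, {lo + (i + 1) * width:.4f}) {'#' * c}")
```

A bimodal histogram here would support the observation that the same task sometimes runs much faster, and point away from a single constant overhead.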
Does anybody want to investigate?
Cheers,
Colin
On 26 Jan 2015 at 11:27, Giovanni Petrucciani notifications@github.com wrote (the reply quoted above).