UMN-CMS/cms-WR

Code too slow

shervin86 opened this issue · 14 comments

analysis.cpp is very slow. I'm running some tests to see whether the loops can be optimized.
I should be able to do that before 4pm.

Coordinate with Peter and Jorge on this. They started discussing this issue yesterday evening, and Peter is looking into making analysis.cpp faster. One suggestion Jorge made was to skip lines 464 to 493 of analysis.cpp when multiple toys are run. That block begins with
if(selEvent.isPassingLooseCuts(channel) )

https://github.com/UMN-CMS/cms-WR/blob/master/bin/analysis.cpp#L464-L493

This section is only relevant when working with the dytagandprobe minitree, and it doesn't need to be re-evaluated (N-1) times when N toys are thrown. When running more than 1 toy, it would help to change line 464 of analysis.cpp to this:

if(selEvent.isPassingLooseCuts(channel) && loop_one)
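
As a minimal sketch of the idea (not the actual analysis.cpp code), the toy loop below guards the loose-cut block with a first-toy flag. `ToyEvent`, `Counters`, and `runToys` are hypothetical names used only for illustration, and the free function `isPassingLooseCuts` merely stands in for `selEvent.isPassingLooseCuts(channel)`. The sketch tests `loop_one` first, so the cut itself is not evaluated after the first toy; with the operand order written above, `&&` would still call the cut function on every toy before checking the flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the selected event used in analysis.cpp.
struct ToyEvent {
    bool passesLoose;
};

// Bookkeeping so the effect of the guard can be inspected.
struct Counters {
    std::size_t cutCalls = 0;   // how often the loose-cut check was evaluated
    std::size_t blockRuns = 0;  // how often the guarded block was entered
};

// Stand-in for selEvent.isPassingLooseCuts(channel); counts its own calls.
bool isPassingLooseCuts(const ToyEvent& ev, Counters& c) {
    ++c.cutCalls;
    return ev.passesLoose;
}

Counters runToys(const std::vector<ToyEvent>& events, int nToys) {
    assert(nToys >= 0);
    Counters c;
    for (int toy = 0; toy < nToys; ++toy) {
        const bool loop_one = (toy == 0);  // true only for the first toy
        for (const ToyEvent& ev : events) {
            // loop_one is checked first, so short-circuiting skips the
            // cut evaluation entirely on every toy after the first.
            if (loop_one && isPassingLooseCuts(ev, c)) {
                ++c.blockRuns;  // tag-and-probe work would go here
            }
            // per-toy selection would follow here
        }
    }
    return c;
}
```

With three events and five toys, the cut is evaluated three times in total instead of fifteen.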

Regards,
Sean Kalafut

Physics PhD candidate
University of Minnesota Twin Cities

Definitely, I will shortly.

Regards,
Sean Kalafut

On Tue, May 17, 2016 at 2:27 PM, shervin86 notifications@github.com wrote:

Can you add what we discussed today?

CPU usage is at most 20% when running this way:
./bin/analysis --channel=MuMu --mode=DYPOWHEG
The time is spent in I/O.

I tried copying the miniTree events into memory before the loop over the toys. Loading takes some time, but after that the job runs at 100% CPU.

These changes will be added and these checks will be performed today or tomorrow:

  1. When running more than 1 toy, skip selEvent.isPassingLooseCuts(channel) after the first toy is finished. This selection is only needed for studying the dytagandprobe branch, which is only done with one toy.
  2. Can we store all events from one signal or flavoursideband minitree in RAM? If not, could we store in RAM a preselected minitree with events containing at least one lepton with pt > 54, another lepton with pt > 44, and two jets with pt > 30?
  3. If the results of 2 are satisfactory, add a preselection function to the Selector class that selects miniTreeEvent objects with at least one lepton with pt > 54, at least one more lepton with pt > 44, and at least two jets with pt > 30. Store these events in a std::vector in analysis.cpp outside the loop over all toys. Inside the toy loop, loop over this vector to process the events with the signal, emu sideband, or low dilepton mass region requirements.
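
Steps 2 and 3 can be sketched roughly as follows. This is only an illustration under assumptions: `MiniEvent` is a hypothetical, stripped-down event holding just lepton and jet pt values, and `passesPreselection`, `preselect`, and `processToys` are made-up names, not the real miniTreeEvent or Selector interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical simplified event: only the pt values needed for the cuts.
struct MiniEvent {
    std::vector<double> leptonPt;
    std::vector<double> jetPt;
};

// Preselection from the list above: one lepton with pt > 54, a second
// lepton with pt > 44, and at least two jets with pt > 30.
bool passesPreselection(const MiniEvent& ev) {
    auto countAbove = [](const std::vector<double>& pts, double cut) {
        return std::count_if(pts.begin(), pts.end(),
                             [cut](double pt) { return pt > cut; });
    };
    // A lepton above 54 is also above 44, so requiring two leptons above 44
    // guarantees a second lepton in addition to the leading one.
    return countAbove(ev.leptonPt, 54.0) >= 1 &&
           countAbove(ev.leptonPt, 44.0) >= 2 &&
           countAbove(ev.jetPt, 30.0) >= 2;
}

// Done once, outside the toy loop: keep only preselected events in memory.
std::vector<MiniEvent> preselect(const std::vector<MiniEvent>& all) {
    std::vector<MiniEvent> kept;
    std::copy_if(all.begin(), all.end(), std::back_inserter(kept),
                 passesPreselection);
    return kept;
}

// Each toy loops over the cached vector instead of re-reading the minitree.
// Returns the number of event visits, as a stand-in for the real processing.
std::size_t processToys(const std::vector<MiniEvent>& cached, int nToys) {
    assert(nToys >= 0);
    std::size_t visits = 0;
    for (int toy = 0; toy < nToys; ++toy) {
        for (const MiniEvent& ev : cached) {
            (void)ev;  // signal / sideband selection would go here
            ++visits;
        }
    }
    return visits;
}
```

The point of the design is that the slow, I/O-bound read happens once, while the per-toy work touches only the in-memory vector.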

Another route would be to use our cluster to run the toys. We have lots of cores that aren't doing much lately. But that would require transferring the minitrees and developing the job splitting.

Transferring miniTrees should not be a problem (a few hundred GB).
I think that copying the miniTree into RAM is a good solution, and it is useful whether running at CERN or at UMN.

Time to load all the events into RAM:
[INFO] Entries: 443707
Real time 0:03:44, CP time 15.200

A draft of the modified analysis.cpp can be found in branch scalesSmearings.
I cannot continue working on that now. Can someone continue working on it?
Please follow the instructions in doc/instructions.doc of THAT branch.

Peter, could you look into that? (branch scalesSmearings)

Regards,
Sean Kalafut

Sure. I'll look at it this afternoon.

Here is a summary of tests I ran on some of the largest minitrees. The tag-and-probe minitrees are larger, but even fewer of their events pass the suggested preselection.
Branch:
https://github.com/UMN-CMS/cms-WR/tree/analysis_preselect
Commit:
3a21d51

[INFO] Reading chain for: TTJets_DiLept_v2 miniTree_lowdileptonsideband
Loading events (nEvents = 1302667): [100%]
Real time 0:16:46, CP time 29.520
Processing events (nEvents = 487248): [100%]
Real time 0:00:40, CP time 40.130

Total RAM usage was:
Maximum resident set size (kbytes): 2049988

[INFO] Reading chain for: DYToEE_powheg miniTree_dytagandprobe
Loading events (nEvents = 20598344): [100%]
Real time 0:18:18, CP time 245.660
Processing events (nEvents = 189312): [100%]
Real time 0:00:16, CP time 15.850

Total RAM usage was:
Maximum resident set size (kbytes): 993296
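
To make the timings above easier to compare, here is a small sketch that turns each pair of numbers into a CPU fraction. It assumes "Real time" is wall-clock time in h:mm:ss and "CP time" is CPU time in seconds; `toSeconds` and `cpuFraction` are hypothetical helper names, not part of the analysis code.

```cpp
#include <cassert>

// Converts an h:mm:ss wall-clock reading to seconds.
double toSeconds(int hours, int minutes, int seconds) {
    return hours * 3600.0 + minutes * 60.0 + seconds;
}

// Fraction of the wall-clock time during which the CPU was busy.
double cpuFraction(double cpuSeconds, double realSeconds) {
    assert(realSeconds > 0.0);
    return cpuSeconds / realSeconds;
}
```

Under that assumption, loading TTJets_DiLept_v2 gives cpuFraction(29.52, toSeconds(0, 16, 46)), about 0.03, and loading DYToEE_powheg gives about 0.22, while both processing steps come out at roughly 1.0. That is consistent with the loading step being I/O-bound and the processing step being CPU-bound once the events are in memory.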

Great, thanks Peter. The TTJets_DiLept_v2 minitrees are some of the largest, and it is good to know that even the lowdileptonsideband minitree from that dataset can fit in ~2 GB of RAM before preselection.

Regards,
Sean Kalafut

They fit after preselection, not before.