Inconsistency in gen-jet matching between jetMetUncertainties and SmearedJetProducerT (recommendation)
pieterdavid opened this issue · 22 comments
If I am reading the code correctly, the jetmetUncertainties module uses a looser criterion for matching to generator-level jets (closest jet within DeltaR < 0.4, see here, called from here and used here) than the recommendation on the JetResolution twiki (DeltaR < 0.2, i.e. half the cone size, and the absolute value of the relative pT difference smaller than three times the pT resolution), which is also implemented in SmearedJetProducerT (with the configuration here).
Has the recommendation been changed, or is this a bug in jetMetUncertainties?
Hi @pieterdavid, thanks for pointing this out; I'm also very interested in the answer to this question. To be sure we are on the same page, I implemented a fix for testing in this commit.
Is this what you had in mind?
Thanks @AndreasAlbert! Yes, I think we're on the same page. Did you try to run your code? I think you should either move resolution_matching out of the class definition, or add @staticmethod and pass it with presel=jetmetUncertaintiesProducer.resolution_matching (I would also change matchObjectCollection so that it checks the distance first, since that is probably faster than finding the resolution and applying the cut on dpT, but we can do that once it is clear what should be implemented).
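For illustration, a minimal sketch of the @staticmethod option (class and function names are the ones from this thread; the body is a placeholder, not the actual implementation, and the import assumes a nanoAOD-tools environment):

```python
from PhysicsTools.NanoAODTools.postprocessing.framework.eventloop import Module

class jetmetUncertaintiesProducer(Module):
    @staticmethod
    def resolution_matching(jet, genjet):
        # placeholder for the dR + relative-dpT criterion discussed below
        raise NotImplementedError

# Defined as a regular method it would take self as its first argument, so it
# could not be passed directly as a presel(jet, genjet) callable; with
# @staticmethod it can be handed to matchObjectCollection unbound, e.g.
#   presel=jetmetUncertaintiesProducer.resolution_matching
```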
Yeah, when I run it, I see relatively large differences in jet pt, so it seems to be a significant effect (looking at ZJetsToNuNu HT 400-600).
I agree that the implementation is hacky for now. I can't tell whether the suggestion you are making would affect the behaviour of the code. Would it? Or are you proposing it for clarity?
Sorry, I missed an indent when reading the diff - this should work and be equivalent indeed
Small update: the implementation linked above is incorrect because the resolution value is relative, not absolute. The selection rule should therefore be:
abs(jet.pt() - genjet.pt()) < 3 * resolution * jet.pt()
rather than
abs(jet.pt() - genjet.pt()) < 3 * resolution
A corrected commit is here. This change removes many of the differences I had seen previously, indicating that those were due to the overly tight matching criterion you get when you do not multiply by pt. To quantify the (now smaller) effect of the implementation change, I will have to look at more than a few events.
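For reference, a minimal sketch of the corrected criterion as I understand it (my own summary, not the commit itself; `resolution` is the relative pT resolution for this jet, and the accessors follow the style of the rule above):

```python
import math

CONE_SIZE = 0.4  # AK4 jets

def resolution_matching(jet, genjet, resolution):
    """JME recommendation: dR < cone/2 and |dpT| within 3 sigma (relative)."""
    dphi = (jet.phi() - genjet.phi() + math.pi) % (2. * math.pi) - math.pi
    dR = math.hypot(jet.eta() - genjet.eta(), dphi)
    return (dR < CONE_SIZE / 2.
            and abs(jet.pt() - genjet.pt()) < 3. * resolution * jet.pt())
```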
Thanks @AndreasAlbert for the checks! I also only looked at a few events, so I don't really know what to expect (it's not excluded that in distributions things average out and the difference is quite small in the end).
@camclean @alefisico could you also have a look and comment? Thanks
A typical pattern I see now is that the new and old implementations in nanoaod agree well, but low-pt jets disagree between nanoaod and miniaod. Example event:
Jet    Nano old    Nano new    Mini      Old/Mini-1    New/Mini-1    New/Old-1
-----  ----------  ----------  ------    ----------    ----------    ---------
0      247.86      247.86      247.91    -0.00         -0.00          0.00
1      177.73      177.73      177.70     0.00          0.00          0.00
2       30.89       30.89       30.91    -0.00         -0.00          0.00
3       19.23       19.23       19.23     0.00          0.00          0.00
4       13.59       13.59       13.13     0.04          0.04          0.00
This is in a sample with Z+4 partons at LO, so ~4 matched jets are expected. The low-pt jets are likely to be unmatched, which explains why they behave differently from the high-pt ones. This leads to one of two possible conclusions:
a) The random number used for smearing unmatched jets is different in nanoaod and miniaod. The random seed per event is set here in nanoaod-tools and here in miniaod, which at face value look consistent, although I haven't tested. @IzaakWN I assume you tested already while implementing?
b) If the random number is the same, there must be a difference in how the correction is applied.
I think we will need to understand which one it is.
The jet pt is saved in NanoAOD with limited numerical precision (a 10-bit mantissa) in order to save space.
Can you please check that this is not what you're observing?
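For scale, a small illustrative snippet (pure Python, my own toy: it truncates the mantissa, while the NanoAOD packing rounds, so the real worst case is about half of this):

```python
import struct

def reduce_mantissa(x, nbits=10):
    """Keep only the top `nbits` mantissa bits of a float32."""
    (i,) = struct.unpack('<I', struct.pack('<f', x))
    mask = (0xFFFFFFFF << (23 - nbits)) & 0xFFFFFFFF  # sign + exponent + nbits
    (y,) = struct.unpack('<f', struct.pack('<I', i & mask))
    return y

pt = 13.13
print(reduce_mantissa(pt), abs(reduce_mantissa(pt) / pt - 1.))
# relative error <~ 2**-10 ~ 0.1%, far below the ~4% seen for jet 4 above
```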
It's a while back, but I have not run the miniAOD code to compare the seeds. Would it be easy for you to add a printout of whether the jets are unmatched, and of which seed is used? If it's different, my first guess would be that the first jet in the collection has changed (and thus jet0eta), otherwise that something went wrong with the long conversion or the bitwise shift.
@peruzzim what about jet eta though? Is that also 10 bits? I can't really tell from the file you link.
If I see correctly in the code, eta and phi should be saved with 12 bits.
And indeed, no such effect should be at the 10-20% level; they should rather be below the 1% level, unless something in the smearing procedure amplifies them (bin migration?).
A first check shows that the seeds agree between nanoaod and miniaod in almost all cases: in 99 out of 100 cases I see the same seed, see [1]. These are the same events I looked at previously for the jet pt comparison, so it is clear that almost none of the difference seen there can be explained by incorrect seeds.
(Note that the 'leading jet eta as integer' variable does sometimes not agree in the table, which is due to it being unsigned in C++ and signed in python. This is just a technical artifact; the bit values and resulting seeds still agree.)
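To illustrate that artifact: the eta-based term enters the seed as a uint32_t in C++ but as a plain (signed) int in Python, and the two agree once reduced modulo 2**32. A toy example (the 12345 term is just a stand-in for the run/lumi/event contribution, not the actual formula):

```python
eta0 = -1.37                      # hypothetical leading-jet eta
jet0eta_py = int(eta0 / 0.01)     # Python keeps the sign
jet0eta_cpp = jet0eta_py % 2**32  # C++ uint32_t wraps to a large positive value
assert (jet0eta_py & 0xFFFFFFFF) == jet0eta_cpp

# any seed built from these agrees after the same 32-bit reduction:
assert (jet0eta_py + 12345) & 0xFFFFFFFF == (jet0eta_cpp + 12345) % 2**32
```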
[1] https://docs.google.com/spreadsheets/d/1u1xPuuNsQWx8DpaTPj543-l8csG-Cl7kNya2MnYh_4o/edit?usp=sharing
In the 10 events (2016 DY MC, NanoAODv5) I am using for testing, I found one jet that is matched to a gen-level jet with the code in master but not with the patch applied (the seeds must be the same in this case). Printout from a quick comparison (where "ref" is master and "test" is master plus the latest patch by @AndreasAlbert):
Event (1, 183967, 87414527) Jet #3 pt=24.593750 DIFF
ref pt_nom =24.171104, test pt_nom =25.437929
ref pt_jerUp =24.000845, test pt_jerUp =25.609318
ref pt_jerDown=24.341364, test pt_jerDown=25.235062
(since the difference depends on the random number used in one of the two cases, I'd have to run over some more events before concluding anything meaningful - but this does show that the effect can be significant for individual jets)
I made a bigger comparison table for 2018 ZJetsToNuNu. I see similar effects to what you describe.
Two interesting features:
- Even if all smeared jet pts agree between old/new/miniaod, the smeared MET can still disagree between nanoaod and miniaod, which indicates that there must be something else going on besides the pure jet pt values (see e.g. event 10 in the table).
- In some cases, the number of jets in the smeared collection in miniaod is smaller than in nanoaod. Before smearing, the additional jets are present in miniaod as well, but they are then lost in the smearing, which is strange. This can also happen at relatively high pt, see e.g. event 91, where a ~300 GeV jet vanishes.
As for the question of why the smearing is different for identical seeds, I think the cause is the use of different random number generators. In NanoAOD, TRandom3 is used:
self.rnd = TRandom3(seed)
(...)
rand = self.rnd.Gaus(0,jet_pt_resolution)
smearFactor = 1. + rand * math.sqrt(jet_pt_sf_and_uncertainty[central_or_shift]**2 - 1.)
while in MiniAOD, it is:
m_random_generator = std::mt19937(seed);
std::normal_distribution<> d(0, sigma);
smearFactor = 1. + d(m_random_generator);
AFAICT, TRandom3 and mt19937 nominally implement the same algorithm, but from a simple check I can tell that for a given seed they do not give identical random numbers (which is expected given their independent implementations).
In the end, we will have to compare actual distributions to understand whether the results are statistically consistent.
Indeed, TRandom3 and std::mt19937 give different random numbers (from a quick check the actual random 32-bit integers are the same, but TRandom3::Gaus and std::normal_distribution do not turn them into the same floats).
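For anyone who wants to reproduce the check, a quick PyROOT sketch (assuming a ROOT build where cppyy exposes the C++ standard library; note that std::normal_distribution is implementation-defined, so its exact output also depends on which C++ standard library is used):

```python
import ROOT

seed = 4357
troot = ROOT.TRandom3(seed)
mstd = ROOT.std.mt19937(seed)
for i in range(3):
    # TRandom3::Rndm() is the raw tempered 32-bit MT19937 output scaled
    # by 2**-32 (redrawing on exact zero), so scaling back recovers it:
    print(i, int(troot.Rndm() * 2**32), mstd())

# same raw stream, different Gaussian transforms:
print(ROOT.TRandom3(seed).Gaus(0., 1.))
gen = ROOT.std.mt19937(seed)
dist = ROOT.std.normal_distribution['double'](0., 1.)
print(dist(gen))
```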
The difference in MET is likely due to the smearing of the low-PT jets also being different (matching to a gen-level jet should be less likely for those, so different criteria may more often have an impact).
The thing that confused me was that I sometimes get MET differences even if all jets agree. E.g. this event:
Object        Nano old    Nano new    Mini      Old/Mini-1    New/Mini-1    New/Old-1
----------    ----------  ----------  ------    ----------    ----------    ---------
MET no jer     60.24       60.24       60.22     0.00          0.00          0.00
MET + jer      57.16       57.16       60.18    -0.05         -0.05          0.00
Jet 0         222.76      222.76      222.79    -0.00         -0.00          0.00
Jet 1         220.95      220.95      220.86     0.00          0.00          0.00
Jet 2          81.13       81.13       81.09     0.00          0.00          0.00
Jet 3          31.12       31.12       31.12    -0.00         -0.00          0.00
The CorrT1METJet jets are smeared and used for the MET, but the results are not stored in the postprocessed tree (they are there starting from NanoAODv5). It is a bit confusing indeed; I noticed this when adding a print statement in the jetSmearer.
That makes a lot of sense. So, to verify that everything is in order, the way to go is to temporarily store the low-pt jets, too.
I made a comparison of the jet and met pt spectra with 10k Znunu events. The gen matching change has no statistically significant effect in events with real MET. The distributions also agree well with MiniAOD. I could imagine that there would be a bit more of an effect in events dominated by fake MET, e.g. DY, but I cannot imagine it to be very big in any case.
Slides are here, the password is your favorite experimental collaboration in all lowercase.
Bottom line: We should run this by JME and make a PR, but from my point of view this is a non-issue.
Thanks a lot @AndreasAlbert for the comparisons! Indeed, it doesn't seem to cause big differences.
One more point: by using a small wrapper it may actually be possible to have the same random numbers here as in MiniAOD, in case that is considered useful.
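Something along these lines, for example (an untested sketch: MiniAODGaus is a hypothetical name, it again assumes PyROOT/cppyy access to the standard library, and bit-exact agreement would additionally require the same C++ standard library as the CMSSW build, since std::normal_distribution is implementation-defined):

```python
import ROOT

class MiniAODGaus(object):
    """Hypothetical drop-in for TRandom3 in jetSmearer, drawing Gaussians the
    way SmearedJetProducerT does: std::mt19937 feeding a freshly constructed
    std::normal_distribution per draw, as in the MiniAOD excerpt above."""
    def __init__(self, seed=0):
        self.generator = ROOT.std.mt19937(seed)
    def SetSeed(self, seed):
        self.generator.seed(seed)
    def Gaus(self, mu, sigma):
        dist = ROOT.std.normal_distribution['double'](mu, sigma)
        return dist(self.generator)
```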
I just discussed this with @fgolf for a bit and we agreed that it would be desirable in the long run to have consistent random numbers between nano and miniaod, although with low urgency. @pieterdavid if you want to implement this, I think it would be welcome!