Fix subtle sort bug in create_covariance.py

Question

Closed this issue 4 months ago · 2 comments

After create_covariance.py reads the BBC output HD, the data frame is internally sorted by CID, which can cause problems when sim data contain duplicate CIDs; e.g., independent generation of SNIa and SNCC can produce random duplicate CIDs. Sorting by CID leaves the ordering among duplicate CIDs random, and thus the systematics HDs may be slightly misaligned. I noticed this artifact from a crazy wfit chi2 using the HDIBC method, which interpolates two HDs. However, this artifact may also have impacted previous simulated data samples that were processed with systematics.

This bug should not impact data provided that there are no duplicates.
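To check whether a given HD is exposed to this problem, one can look for repeated CIDs before sorting. The following is only a minimal sketch: the helper name `duplicated_cids` and the toy table are hypothetical and not part of create_covariance.py; only the `CID` and `zHD` column names are taken from this issue.

```python
import pandas as pd

def duplicated_cids(df, cid_col="CID"):
    """Return the sorted list of CID values that occur more than once."""
    # keep=False flags every row of a duplicated CID, not just the repeats
    mask = df[cid_col].duplicated(keep=False)
    return sorted(df.loc[mask, cid_col].unique().tolist())

# Hypothetical toy HD: CID 11 appears twice at different redshifts
hd = pd.DataFrame({"CID": [11, 12, 11, 13], "zHD": [0.3, 0.4, 0.7, 0.2]})
dups = duplicated_cids(hd)
```

An empty list means the CID-only sort is unambiguous for that table.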

Answer 1 · 2024-07-24T01:49:46.000Z

The fix is to sort by redshift and CID;
replace
df = df.sort_values([ "CID"])
with
df = df.sort_values([ "zHD", "CID"])
Note that, as a result, future unbinned HDs are z-sorted.
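The effect of the change can be illustrated with a toy example. This is a sketch under stated assumptions, not the actual create_covariance.py code: the tables, the `MU` column, and the `sort_hd` helper are hypothetical, and `kind="stable"` is used only to make the tie order reproducible here (ties then keep their input order, which is exactly what differs between the two files).

```python
import pandas as pd

# Hypothetical toy HD in which CID 7 appears twice (e.g. one SNIa, one SNCC)
hd_nominal = pd.DataFrame({
    "CID": [7, 3, 7],
    "zHD": [0.50, 0.20, 0.10],
    "MU":  [42.3, 40.0, 38.3],
})
# Same events, but assumed to arrive in a different row order from a systematics run
hd_syst = hd_nominal.iloc[[2, 1, 0]].reset_index(drop=True)

def sort_hd(df, keys):
    # Hypothetical helper: sort on the given keys and renumber the rows
    return df.sort_values(keys, kind="stable").reset_index(drop=True)

# Sorting by CID alone: the two rows with CID 7 end up in different order
cid_only_aligned = sort_hd(hd_nominal, ["CID"])["zHD"].equals(
    sort_hd(hd_syst, ["CID"])["zHD"])

# Sorting by zHD then CID: row order no longer depends on the input order
z_cid_aligned = sort_hd(hd_nominal, ["zHD", "CID"])["zHD"].equals(
    sort_hd(hd_syst, ["zHD", "CID"])["zHD"])
```

In this toy case the CID-only sort leaves the two tables misaligned row by row, while the zHD-then-CID sort lines them up. Two events sharing both zHD and CID would still be ambiguous, but that should be far less likely than a duplicate CID alone.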

Answer 2 · 2024-07-25T16:51:02.000Z

Are CID duplicates expected?