Unexplainable NFW sampling discrepancy
Opened this issue · 11 comments
Hello hello, while writing our own HOD code with @bhorowitz and others we were trying to compare our results to halotools as God's truth, but were finding some residual differences in the 2pt functions that I can't explain.
I have isolated at least one thing that could explain our discrepancies, related to NFW sampling of the satellites. In fact, as far as I understand, I seem to be getting inconsistent results within halotools itself, depending on whether I sample positions manually with mc_generate_nfw_radial_positions or sample satellites for the catalog normally.
I made a minimal demo notebook here, and I'm getting this sort of discrepancy in the radial distribution of satellites:
A difference of this size could actually account for the discrepancies I'm seeing downstream in the 2pt function.
If someone could look at the demo notebook and explain to me why the two histograms don't match, I would be sooooo happy.
we were trying to compare our results to halotools as God's truth
I just checked the Contributor List, but I don't see anybody with this name on it. If I run a quick "git blame" check on the relevant section of source code, it looks like the populate_mock function was just written by some guy ;-)
Based on a quick inspection of your notebook (which is very clear, thanks!), it looks like the populate_mock function might not be properly calling the Monte Carlo generator of radial positions. This is an important bug to fix before the next release, so I'll get to this as soon as I can.
Thanks so much!
I'm still not sure, but it looks like this might be related to the use of a lookup table for the concentration. For some values of the manually-overridden concentration, I get agreement with the NFW reference, and for others I don't. Here are two examples where the left-hand panel is the same diagnostic plot you wrote, and the right-hand panel is just the fractional difference, with sign convention defined by (mock - reference) / reference.
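To illustrate how a coarse concentration lookup table could produce this kind of pattern, here is a toy sketch in plain NumPy. It is not halotools' actual implementation: the table range, the spacing dc, the snap-to-nearest rule, and the example concentration are all assumptions made for illustration. It compares the expected radial bin fractions of an NFW profile at the true concentration against the same profile evaluated at the nearest table concentration, using the (mock - reference) / reference convention.

```python
import numpy as np


def nfw_cumulative_mass_fraction(x, conc):
    """Fraction of halo mass enclosed within scaled radius x = r / R_vir."""
    def m(y):
        return np.log1p(y) - y / (1.0 + y)
    return m(conc * x) / m(conc)


def snap_to_table(conc, dc, c_min=1.0, c_max=25.0):
    """Return the entry nearest to conc in a table with spacing dc.

    The table range and the nearest-entry rule are assumptions of this toy.
    """
    table = np.arange(c_min, c_max + dc, dc)
    return table[np.argmin(np.abs(table - conc))]


def max_fractional_difference(true_conc, dc, num_radial_bins=20):
    """Largest |(mock - reference) / reference| over the radial bins."""
    edges = np.linspace(0.0, 1.0, num_radial_bins + 1)
    # Expected fraction of satellites per radial bin at the true concentration.
    reference = np.diff(nfw_cumulative_mass_fraction(edges, true_conc))
    # Same quantity, but using the concentration snapped to the lookup table.
    mock_conc = snap_to_table(true_conc, dc)
    mock = np.diff(nfw_cumulative_mass_fraction(edges, mock_conc))
    return np.abs((mock - reference) / reference).max()


# Hypothetical concentration that falls between coarse table entries.
for dc in (1.0, 0.1):
    diff = max_fractional_difference(7.4, dc)
    print(f"dc = {dc}: max |fractional difference| = {diff:.2e}")
```

In this toy, a concentration that happens to sit on a table entry gives no difference at all, which would be consistent with seeing agreement for some concentration values and not for others.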
@EiffL here's my branch where I tried more finely spaced concentrations: https://github.com/astropy/halotools/tree/mockpop_bugfix
Could you try running your tpcf test to see whether this reduces the discrepancy you have been finding?
Actually, you don't need to use a different branch. In the current master branch, you can test this hypothesis by passing the concentration_bins argument to the PrebuiltHodModelFactory.
Thanks a lot @aphearin, I tried your fix and it does indeed seem to improve the NFW profile quite a bit. I'm trying to check what happens at the level of the 2pt functions, and will post plots when I get some convincing results.
so yeah, looks like this is solving the problem!
Great, thanks for independently confirming that this is the issue. I'm glad this turned out to have a simple resolution.
But it looks like changing dc makes the sampling code way slower for some reason.
Right, yes, there's a trade-off between the size of the lookup table and the performance. I could try to optimize this a bit and see how it goes, though I think the real solution here is for the NFW profile to be reimplemented to be bin-free using the analytical solution for the CDF inverse. Originally, I had thought that it was a good idea to develop this lookup table machinery since it would be more general, covering cases that had no closed-form analytical solution for the inverse CDF, and thinking that people would want to try out all manner of different profiles. But over time, it became clear that people just wanted to use NFW, and so this feature no longer seems so important.
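For reference, a bin-free NFW sampler along these lines can be sketched with the Lambert W function. This is a minimal sketch and not halotools code: the function names are hypothetical, and it assumes SciPy's scipy.special.lambertw is available. With m(y) = ln(1 + y) - y / (1 + y), the enclosed mass fraction at scaled radius x = r / R_vir is m(c x) / m(c). Setting that equal to a uniform deviate p and solving for x gives c x = -1 / W0(-exp(-p m(c) - 1)) - 1, where W0 is the principal branch.

```python
import numpy as np
from scipy.special import lambertw


def nfw_mass_function(y):
    """m(y) = ln(1 + y) - y / (1 + y), the dimensionless NFW enclosed mass."""
    return np.log1p(y) - y / (1.0 + y)


def nfw_inverse_cdf(p, conc):
    """Scaled radius x = r / R_vir at which the NFW mass CDF equals p.

    Closed-form inversion via the principal branch of the Lambert W function.
    conc may be a scalar or an array that broadcasts against p.
    """
    target = np.asarray(p) * nfw_mass_function(conc)
    # lambertw returns complex values; the argument lies in [-1/e, 0),
    # where the principal branch is real, so keep only the real part.
    w = lambertw(-np.exp(-target - 1.0), k=0).real
    return (-1.0 / w - 1.0) / conc


def sample_nfw_scaled_radius(conc, num_pts, seed=None):
    """Draw num_pts scaled radii in [0, 1] without any concentration table."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.0, 1.0, size=num_pts)
    # Clip only to guard against floating-point rounding at the edges.
    return np.clip(nfw_inverse_cdf(p, conc), 0.0, 1.0)


# Example: one radius per satellite, each with its own host concentration
# (hypothetical values).
host_conc = np.array([4.2, 7.4, 11.9])
print(sample_nfw_scaled_radius(host_conc, host_conc.size, seed=42))
```

Because every satellite uses its host's exact concentration, there is no table spacing to tune, so the accuracy versus speed trade-off described above would not arise in this approach.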