tdgrant1/denss

suggestion: add fit to the actual experimental data

dfranke76 opened this issue · 1 comments

After running DENSS a few times now, I come to miss the "fit to the experimental data" with all data points.

When running denss.py (e.g. "denss.py -u nm -d 105 -m slow -f SASDA68.out"), a .fit file is created. In this fit file, the "experimental data" points seem to be the desmeared and extrapolated data from the .out file used as input, and these points have been assigned error estimates of unknown provenance. While I can see that this makes sense internally, for a user it would be nice to also get the fit (and chi-square) to the actual experimental data (what elsewhere would be called a .fir file, "fit to real data").

It might be me and my usage, but oddly, when I run denss.all.py, there is no fit file like the one described for the single case, only multiple "_map.fit" files with about one-tenth of the data points?

So it turns out this issue is due to a combination of two things: some confusion about what the .fit files are, and a bug in reading GNOM .out files on my part (sorry about that).

In your case, denss should never have made the .fit file you're referring to (that is the bug). There are two kinds of .fit files. The one from the density map reconstruction is the _map.fit file, which contains the intensities calculated from the density map. Note that DENSS uses profiles that are sampled much more sparsely than experimental data, as described in the original paper, which is why there are only a handful of points in the _map.fit file (the log file will give you more details on the sampling). This is why a smooth fit is required for DENSS to take advantage of the oversampling provided by experiment.

However, a relatively recent feature is the ability to fit raw data with a smooth function (kind of like GNOM does). In such cases, denss also saves a .fit file (which I believe is the one you're referring to). That file has nothing to do with the density map. DENSS attempts to decide whether the data are raw and need to be fit with a smooth function (based simply on whether I(q=0) exists). When using a GNOM .out file that is not necessary, so DENSS shouldn't have fit the data and saved the .fit file in your case. It turns out that was due to a bug in reading .out files, introduced in an update. I'm fixing that currently and will push it soon.
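To make the heuristic above concrete, here is a minimal sketch of that kind of check. The function name `needs_smooth_fit` and the tolerance `tol` are hypothetical and chosen for illustration only; this is not the code DENSS uses, just one way the "does I(q=0) exist" test could look.

```python
def needs_smooth_fit(q_values, tol=1e-8):
    """Return True if the profile has no point at q=0 (within tol).

    Hypothetical helper illustrating the heuristic described above;
    it is not taken from the DENSS source.
    """
    if not q_values:
        raise ValueError("empty scattering profile")
    # Raw data normally start at some q_min > 0, whereas a smooth fit
    # (e.g. from a GNOM .out file) is extrapolated down to q = 0.
    return min(q_values) > tol


extrapolated_q = [0.0, 0.01, 0.02, 0.03]   # e.g. a fitted, extrapolated curve
raw_q = [0.012, 0.014, 0.016, 0.018]       # e.g. a raw experimental profile

print(needs_smooth_fit(extrapolated_q))
print(needs_smooth_fit(raw_q))
```

With a check like this, a profile that already contains q=0 would be used as is, and only a profile starting at q > 0 would be passed to the smooth-fitting step.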

On a broader note, it is highly non-trivial to transform the sparsely sampled DENSS fit to the densely sampled experimental data. As mentioned on the website, the chi^2 calculated there is not meant to be used as a metric quantifying the quality of the fit in absolute terms (that should instead be judged qualitatively by looking at the residuals in the fit.png plot), but simply as a means of assessing convergence, i.e. how chi^2 changes as the reconstruction progresses. Currently I'm doing the opposite in a highly simplified way, i.e. transforming the densely sampled smooth fit of the experimental data to the sparsely sampled q values DENSS uses (just by simple interpolation). I agree that it would be good for users to have an actual chi^2 value calculated against their experimental data, but I don't currently have that feature as it is not straightforward to do. I'll see what I can do in the future to add that.
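As a rough illustration of the simplified approach described above, the sketch below linearly interpolates a densely sampled smooth fit (and its errors) onto a coarse q grid and computes a chi^2 there. All names (`interp_linear`, `chi2_on_coarse_grid`) and the normalization by the number of coarse points are assumptions made for this example; it is not the DENSS implementation.

```python
from bisect import bisect_left


def interp_linear(x_new, x, y):
    """Linearly interpolate (x, y) at x_new; x must be strictly ascending."""
    if x_new <= x[0]:
        return y[0]
    if x_new >= x[-1]:
        return y[-1]
    i = bisect_left(x, x_new)  # first index with x[i] >= x_new
    t = (x_new - x[i - 1]) / (x[i] - x[i - 1])
    return y[i - 1] + t * (y[i] - y[i - 1])


def chi2_on_coarse_grid(q_fine, i_fine, sigma_fine, q_coarse, i_calc):
    """Mean squared error-normalized residual on the coarse grid.

    q_fine, i_fine, sigma_fine: densely sampled smooth fit and its errors.
    q_coarse, i_calc: sparsely sampled intensities calculated from the map.
    Dividing by the number of coarse points is an assumption of this sketch.
    """
    total = 0.0
    for q, i_c in zip(q_coarse, i_calc):
        i_ref = interp_linear(q, q_fine, i_fine)
        sigma = interp_linear(q, q_fine, sigma_fine)
        total += ((i_ref - i_c) / sigma) ** 2
    return total / len(q_coarse)


# Toy data: a straight line stands in for the smooth fit.
q_fine = [k * 0.01 for k in range(11)]
i_fine = [100.0 - 500.0 * q for q in q_fine]
sigma_fine = [1.0] * len(q_fine)

q_coarse = [0.0, 0.025, 0.05, 0.075, 0.1]
i_calc = [100.0 - 500.0 * q for q in q_coarse]

print(chi2_on_coarse_grid(q_fine, i_fine, sigma_fine, q_coarse, i_calc))
```

Tracking a value like this from step to step shows whether the reconstruction is converging, but because it lives on the coarse grid it says little about the absolute quality of the fit to the full experimental data.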