reflectometry/refl1d

theta_offset is changing model calculation range


For some reason, when theta_offset is applied it causes an error in the calculation of the model.
With theta_offset=0:
[screenshot]
With nonzero theta_offset:
[screenshot]

It is not just a visual artifact either, as the chisq is strongly affected.
theta_offset_bug.zip

This is with the stable release Refl1D-0.8.14-exe.zip on Windows.

A proposed patch (I don't know how it will affect performance):

diff --git a/refl1d/probe.py b/refl1d/probe.py
index fd5e1d0..677828d 100644
--- a/refl1d/probe.py
+++ b/refl1d/probe.py
@@ -1562,7 +1562,7 @@ def measurement_union(xs):
     TL = set()
     for x in xs:
         if x is not None:
-            TL |= set(zip(x.T, x.dT, x.L, x.dL))
+            TL |= set(zip(x.T+x.theta_offset.value, x.dT, x.L, x.dL))
     T, dT, L, dL = zip(*[item for item in TL])
     T, dT, L, dL = [np.asarray(v) for v in (T, dT, L, dL)]
 
@@ -1730,6 +1730,10 @@ class PolarizedNeutronProbe(object):
 
     @property
     def calc_Q(self):
+        #print('calculating calc_Q...')
+        self.T, self.dT, self.L, self.dL, self.Q, self.dQ \
+            = measurement_union(self.xs)
+        self._set_calc(self.T, self.L)
         return self.calc_Qo
 
     def _set_calc(self, T, L):
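
To illustrate what the patch is compensating for, here is a standalone sketch (made-up numbers, using the standard specular relation Q = 4*pi*sin(theta)/lambda, not anything from the model above) showing how an uncompensated theta_offset shifts the Q range that the theory needs to cover away from the measured Q range:

import numpy as np

# Illustration only: Q depends on the (offset-corrected) angle, so if the
# merged calculation points ignore theta_offset, the theory no longer spans
# the measured Q range.  All values below are made up.
T = np.linspace(0.2, 5.0, 200)   # measured angles (degrees)
L = 5.0                          # wavelength (Angstroms)
offset = 0.05                    # example theta_offset (degrees)

def Q(theta_deg, wavelength):
    return 4 * np.pi * np.sin(np.radians(theta_deg)) / wavelength

print("data Q range: %.4f to %.4f" % (Q(T, L).min(), Q(T, L).max()))
print("calc Q range: %.4f to %.4f" % (Q(T + offset, L).min(), Q(T + offset, L).max()))
# The mismatch at the ends of the range is where the theory stops covering
# the data, which is what drives chisq up (or to NaN for larger offsets).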

Hi,

I have taken a look at this, and it occurs when you use load4 with a list of files and a +ve theta_offset. If you only load one data set per probe it no longer does this, but I am not convinced it is doing the right thing there either: beyond a certain magnitude of theta_offset the chi^2 returns NaN because the theory no longer extends over the full range of the data. Additionally, I can recreate what you have shown with a -ve theta_offset for just one data set per probe:
[screenshot]
and for -ve offset:
[screenshot]
I have checked it for my TOF models and I get the same as shown in the pictures above.

For reference this is using the version pulled directly from the master branch from the repo.

See my slightly adjusted model used for this:
theta_offset_bug_AJC_edit.zip

I hope this helps with testing your fix and diagnosing the problem. I am also happy to test the patch if it is useful.

Andrew

Thank you Andrew! I think we've pinned down the cause now, and you're right that it wasn't doing the right thing. Paul and I have been discussing the best way to mitigate. The patch above recalculates the merged Q each time the model is adjusted, which might slow things down a little. Another option is to tie all the theta_offset parameters together and apply the same theta_offset in the PolarizedNeutronProbe when calculating Q; that would have less of a performance impact, but wouldn't allow theta_offset to be different for different xs.
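
For the second option, the wiring at the script level would be roughly the sketch below (illustrative only, not the internal change itself; it assumes `probe` is the user's PolarizedNeutronProbe, whose cross-sections in probe.xs each carry their own theta_offset Parameter):

# Tie the cross-section offsets together in a model script by reusing a
# single Parameter object for every cross-section.
shared = next(x for x in probe.xs if x is not None).theta_offset
shared.range(-0.1, 0.1)           # fit a single offset over a small range
for x in probe.xs:
    if x is not None:
        x.theta_offset = shared   # every cross-section shares the same Parameter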

For the vast majority of cases I don't think we would have a different theta_offset for different xs, so that's probably the best route to go down.

But if we wanted to maintain flexibility (for use cases I can't immediately envisage at this point), then maybe we get the user to construct the model in a different way or we have some form of switch in there?

Let me know if you need any help from my side at all.

@acaruana2009 if you have some models you can use to check, it would be helpful to know if the patch above causes any noticeable performance hit. I am kind of leaning toward that solution because it doesn't require patching old user models, and it allows independent theta_offset in the cross-sections (though that might rarely get used).

OK, I am happy to put together a TOF model test as we would use it for POLREF. In terms of performance testing, I can run the old and new versions on my desktop PC with a fixed number of cores for a set time limit and see how the step counts compare. If you know of a better way to performance test, let me know (I have not tried the bumps profiler yet). Do you have the patch in a specific branch yet, or should I just implement the code adjustments above manually?

I just pushed a branch called theta_offset_fix that you can use.

Thanks Brian. I have done some rough testing, and it looks like a 30% slowdown for a four-spin-state simulation using erf analytical interfaces (new vs old). This is a simulation with a Q range from 0.005 to 0.2 and a dQ/Q of 0.03.

Well, that's pretty significant. It's hard to make performance vs. usability decisions!
Just for reference - how many layers did you have? Using analytical interfaces with a small number of layers will be about the worst case for this type of performance hit (the refl calculations themselves are pretty fast, so the overhead from the Python model updates is the most significant fraction of the total time).

Agreed. I thought this was the case, so I wanted the worst-case scenario. It was one layer. See attached (my model file is pretty messy, but you get the idea).
TOF_Ttheta_Offset_performance_test.zip

Thank you Andrew! This is very helpful. What do you think about the patch? Is it worth it to refactor to try to get back the performance? Most solutions down that road would require minor tweaks to users' scripts.

I think the answer will depend on what options we have. A refactor is worth a try, but we might want to maintain usability while allowing an option to write the model in a way that takes advantage of the refactor, e.g. users can keep their current scripts but will take a speed hit if they don't change them.

I assume the speed hit will be worse for more Q points? So x-rays and multiple data sets may suffer more. But as you said, sliced models won't change much.

Ok, I pushed a new version to the theta_offset_fix branch where it checks if a shared theta_offset between the cross-sections has changed before recalculating the union of the Q values. I think this is as efficient as it can get, for the case where cross-sections share a theta_offset.

If the cross-sections don't share a theta_offset value, then it recalculates the entire measurement_union at each model step, but that is a less important optimization according to everyone I've talked to. (We could have it cache the measurement union if the cross-section theta_offsets are different from each other but unchanged from the last iteration.)

Ok, I couldn't leave it alone: the version I just pushed uses the stored value of calc_Qo when the theta_offset values don't change for the cross-sections (even if they are different from each other); when they change but they are all the same, it does the minimal required calculation (adding the shared theta_offset), and when they change but they are not shared, it does the full measurement_union calculation.

This logic is almost identically efficient to the alternative calculation that relies on an additional theta_offset parameter attached to the PolarizedNeutronProbe object directly, since it still has to do the minimal calculation for a shared theta_offset.
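
In pseudocode, that decision tree is roughly the following (a self-contained sketch with made-up data structures, not the actual refl1d code):

import numpy as np

WAVELENGTH = 5.0  # fixed wavelength (Angstroms) just to keep the sketch short

def q_from_theta(theta_deg):
    return 4 * np.pi * np.sin(np.radians(theta_deg)) / WAVELENGTH

def calc_Q(cross_sections, cache):
    # `cache` holds the offsets and calculation angles from the previous call.
    offsets = tuple(xs["theta_offset"] for xs in cross_sections)
    if offsets == cache.get("offsets"):
        return cache["calc_Q"]                 # offsets unchanged: reuse the cache
    if cache.get("offsets") and len(set(offsets)) == 1 and len(set(cache["offsets"])) == 1:
        # shared offset changed: just shift the cached angles by the difference
        cache["calc_T"] = cache["calc_T"] + (offsets[0] - cache["offsets"][0])
    else:
        # offsets differ between cross-sections (or first call): full union
        cache["calc_T"] = np.unique(np.concatenate(
            [xs["T"] + xs["theta_offset"] for xs in cross_sections]))
    cache["offsets"] = offsets
    cache["calc_Q"] = q_from_theta(cache["calc_T"])
    return cache["calc_Q"]

xs = [{"T": np.linspace(0.2, 5.0, 50), "theta_offset": 0.0},
      {"T": np.linspace(0.2, 5.0, 50), "theta_offset": 0.0}]
cache = {}
calc_Q(xs, cache)                              # first call: full union
xs[0]["theta_offset"] = xs[1]["theta_offset"] = 0.05
calc_Q(xs, cache)                              # shared offset changed: cheap shift
xs[0]["theta_offset"] = 0.02
calc_Q(xs, cache)                              # offsets now differ: full union again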

Thanks for all of the work on this. I have been running a test for an hour on the previous version, using the model I had before, and it is between 4-10% quicker than even the version without the fix.

I will try out the current addition shortly and let you know how it goes, although I imagine it won't be much different from the iteration I am trying, since I am not using different offsets for the different cross-sections.

@acaruana2009 Thanks very much for testing!

No problem. I think it might be worth coming up with a reliable way to test performance in the future (I know Paul has mentioned the profiler, and I am aware of timeit etc.). But for now just running the GUI for some period of time seems to work.
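
For a more repeatable comparison than timed GUI runs, something along these lines would time bare model evaluations (a sketch; it assumes a model script "model.py" that builds a bumps FitProblem, and uses the standard FitProblem.nllf()/randomize() calls):

import time
from bumps.fitproblem import load_problem

problem = load_problem("model.py")   # hypothetical model script
n = 200
t0 = time.perf_counter()
for _ in range(n):
    problem.randomize()              # new parameter values each evaluation
    problem.nllf()                   # full theory + chisq evaluation
dt = time.perf_counter() - t0
print("%.1f ms per evaluation" % (1e3 * dt / n))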

OK, so I have run the current fix for 5.5 hrs and it averages out to ~10% slower than without the fix. Given all of the variables in my not-so-scientific test, I think this is fine. Also, as we discussed earlier, for a lot of problems this is unlikely to be the bottleneck - this is a worst-case scenario.

Thanks again.

Ok, I think I'll pull the fix into master then. I feel confident that the extra cost is similar to any solution that fixes the problem (some extra computation needs to be done to fix the bug - it is unavoidable!)
Many thanks again for testing.