argiopetech/base

param matrix bounds overrun


The following code causes an array overrun when called with creative values of ctrl.burnIter (like 50). The overrun is silent (no segfault) in 9.2.1, and triggers std::out_of_range in 9.3.0_pre due to the use of .at().

params.at(p).at((iteration - ctrl.burnIter / 2) / increment) = mc.clust.parameter[p];

It'd be good if we could get Nathan to take a look at this. It's a pretty basic integer rounding issue, but I'd like to make sure we're filling that matrix properly.

Agreed, and I don't understand that line at all. I'll email him with your question...

Ted von Hippel

From Nathan Stein:

Ah, I see the problem. This is saving every (increment)th parameter value during the second half of the burn-in, so that the covariance matrix of these parameters can be calculated at the end of the burn-in. Somewhere in the code I hardcoded nSave, the number of parameter values to save for this calculation. (I just looked at one version of the code I have, and this was set to 10, but I think 100 would be a much better value.)

Of course, in order to have nSave values available, burnIter should be at least 2 * nSave, so it would be good to add a check for this (Fix 1).

But what Elliot caught is more subtle. The reason it fails with ctrl.burnIter = 50 even though nSave = 10 (so the Fix 1 check would pass) is that increment, which is set to ctrl.burnIter / (2*nSave) (or 50/20 here), becomes 2 in this case. We start saving values at iteration 25 and then try to save every other one until the end of the burn-in. So we try to save the values at iterations 25, 27, 29, ..., 49. There are 13 such values, but we only allocated params.at(p) to have length 10.
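
To make the counting concrete, here is a minimal standalone sketch (hypothetical, not the actual loop from base; it assumes the save happens on every (increment)th iteration of the second half of the burn-in):

#include <iostream>

int main() {
    const int burnIter  = 50;
    const int nSave     = 10;
    const int increment = burnIter / (2 * nSave); // 50 / 20 = 2

    int saved = 0;

    // Current behavior: save every (increment)th value from the
    // midpoint of the burn-in to its end.
    for (int iteration = burnIter / 2; iteration < burnIter; ++iteration) {
        if ((iteration - burnIter / 2) % increment == 0) {
            std::cout << "iteration " << iteration << " -> index "
                      << (iteration - burnIter / 2) / increment << '\n';
            ++saved;
        }
    }

    // Prints 13 values (indices 0 through 12), but params.at(p) only has
    // nSave = 10 slots, so .at() would throw std::out_of_range at index 10.
    std::cout << saved << " values saved, " << nSave << " slots allocated\n";
}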

One simple fix (Fix 2) would change

if (iteration >= ctrl.burnIter / 2 && iteration < ctrl.burnIter)

to

if (iteration >= ctrl.burnIter - nSave * increment && iteration < ctrl.burnIter)

so that we don't try to save too many parameter values. (Please double check that this doesn't result in any off-by-one errors!)
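
As a quick sanity check on Fix 2, here is another hypothetical sketch (note that the subscript in the assignment would also need to be computed relative to the new starting iteration rather than ctrl.burnIter / 2, or the overrun just shifts):

#include <iostream>

int main() {
    const int burnIter  = 50;
    const int nSave     = 10;
    const int increment = burnIter / (2 * nSave);
    const int start     = burnIter - nSave * increment; // 50 - 10*2 = 30

    int saved = 0;

    for (int iteration = start; iteration < burnIter; ++iteration) {
        if ((iteration - start) % increment == 0) {
            // Indices run 0 through nSave - 1, with no overrun.
            std::cout << "iteration " << iteration << " -> index "
                      << (iteration - start) / increment << '\n';
            ++saved;
        }
    }

    std::cout << saved << " values saved\n"; // exactly nSave = 10
}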

I think we need both Fix 1 and Fix 2, but if you can think of a better/simpler way to make all this work, let me know.

Hi Nathan,
This issue has worked its way back up to the top of the bug fix heap, but there are a few more things I'd like to look at.

First, the current line where nSave is assigned says /*changed from 100 to 10 */. This implies that at some point we had reason to reduce its size. You mentioned in your email that 100 would be better, so do we want to change it back? Do we recall the previous reason for reducing it, and whether that reason is still valid?

I don't remember why I changed that (I don't even remember changing it), but I suspect it was just so that I could run some tests with a very short burn-in. I think we should change it back, unless we run into other problems.

Next, for Fix 1, burnIter should be at least 2 * nSave. Is it reasonable to change nSave to half of burnIter when burnIter is less than twice nSave, or do we need to exit with an error in the event that burnIter is too low? Would it be reasonable (or sensible) to have nSave always equal to 0.5 * burnIter, or would that be a pointless use of memory when we run with burnIter = 2000?

I don't think we need nSave to be greater than about 100, but it does need to be greater than nParamsUsed; with nParamsUsed or fewer samples, the sample covariance matrix of the parameters is singular.

If burnIter is too low, it might be nicer to set burnIter to the minimum legal value and print a warning, rather than exiting with an error.

How about something along the lines of:

if (burnIter < 2 * (nParamsUsed + 1)) {
    std::cerr << "Warning: burnIter too low; setting it to "
              << 2 * (nParamsUsed + 1) << std::endl;
    burnIter = 2 * (nParamsUsed + 1);
}
nSave = std::min(burnIter / 2, 100); // integer version of min(0.5 * burnIter, 100)

Last, for Fix 2: this reduces somewhat the range over which we're saving values. Is that a concern? If not, is there a reason we can't just save the last nSave values from the burn-in?

If the chains have a lot of correlation during the burn-in, spacing the saved values out a bit helps give us a more reliable estimate of the covariance matrix. It's not a problem to reduce the range a little, but we should space the values out as much as we can.
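
Putting the pieces together, one way to satisfy both constraints (exactly nSave values, spaced as widely as the second half of the burn-in allows) might look like the sketch below. This is illustrative rather than the actual base code; start is an introduced variable and nParamsUsed = 6 is an arbitrary stand-in:

#include <algorithm>
#include <iostream>

int main() {
    int burnIter          = 50; // e.g., ctrl.burnIter
    const int nParamsUsed = 6;  // hypothetical value

    // Fix 1: enforce the minimum legal burn-in length with a warning
    // instead of exiting with an error.
    if (burnIter < 2 * (nParamsUsed + 1)) {
        std::cerr << "Warning: burnIter too low; setting it to "
                  << 2 * (nParamsUsed + 1) << '\n';
        burnIter = 2 * (nParamsUsed + 1);
    }

    const int nSave     = std::min(burnIter / 2, 100);
    const int increment = std::max(1, burnIter / (2 * nSave));
    const int start     = burnIter - nSave * increment; // Fix 2

    // Because nSave * increment <= burnIter / 2, start never falls before
    // the midpoint, so every save lands in the second half of the burn-in,
    // spaced increment iterations apart.
    for (int iteration = start; iteration < burnIter; ++iteration) {
        if ((iteration - start) % increment == 0) {
            std::cout << "save iteration " << iteration << " at index "
                      << (iteration - start) / increment << '\n';
        }
    }
}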