aewallin/allantools

MTIE calculation crashes trying to create too big array


Hi

When calculating MTIE for a data array of 436365 elements, I got an error saying:

Traceback (most recent call last):
File "C:/Work/Projects/TimeErrorSimulationGUI/SupportingFiles/tDEVCalcCheck.py", line 137, in
tauArr, mtieAllanTools, a, b = allantools.mtie(freqTrend, 1 / 0.033, 'freq', taus=tauArr)
File "C:\Work\Projects\TimeErrorSimulationGUI\GUI\allantools.py", line 1086, in mtie
rw = mtie_rolling_window(phase, int(mj + 1))
File "C:\Work\Projects\TimeErrorSimulationGUI\GUI\allantools.py", line 1053, in mtie_rolling_window
return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
File "C:\Users\antonp\AppData\Local\Programs\Python\Python36-32\lib\site-packages\numpy\lib\stride_tricks.py", line 102, in as_strided
array = np.asarray(DummyArray(interface, base=x))
File "C:\Users\antonp\AppData\Local\Programs\Python\Python36-32\lib\site-packages\numpy\core\numeric.py", line 492, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: array is too big; arr.size * arr.dtype.itemsize is larger than the maximum possible size.

The error happens when the window size reaches 1515 elements and the software tries to create a 436365 x 1515 array. I used Python 3.6.5.

To overcome the issue I've added a piece of code which comes into play when the original code hits the size limit (see the except part of the try/except block below) and continues the calculation:
```python
def mtie(data, rate=1.0, data_type="phase", taus=None):
    """ Maximum Time Interval Error.

    Parameters
    ----------
    data: np.array
        Input data. Provide either phase or frequency (fractional,
        adimensional).
    rate: float
        The sampling rate for data, in Hz. Defaults to 1.0
    data_type: {'phase', 'freq'}
        Data type, i.e. phase or frequency. Defaults to "phase".
    taus: np.array
        Array of tau values, in seconds, for which to compute statistic.
        Optionally set taus=["all"|"octave"|"decade"] for automatic
        tau-list generation.

    Notes
    -----
    this seems to correspond to Stable32 setting "Fast(u)"
    Stable32 also has "Decade" and "Octave" modes where the
    dataset is extended somehow?
    """
    phase = input_to_phase(data, rate, data_type)
    (phase, m, taus_used) = tau_generator(phase, rate, taus)
    devs = np.zeros_like(taus_used)
    deverrs = np.zeros_like(taus_used)
    ns = np.zeros_like(taus_used)

    for idx, mj in enumerate(m):
        try:
            # original vectorized path: (N - mj) x (mj + 1) rolling-window view
            rw = mtie_rolling_window(phase, int(mj + 1))
            win_max = np.max(rw, axis=1)
            win_min = np.min(rw, axis=1)
            tie = win_max - win_min
            dev = np.max(tie)
        except (ValueError, MemoryError):
            # the rolling-window array could not be created ("array is too big"
            # ValueError on 32-bit builds, or MemoryError); fall back to a
            # slower loop that slides the window one sample at a time
            if int(mj + 1) < 1:
                raise ValueError("`window` must be at least 1.")
            if int(mj + 1) > phase.shape[-1]:
                raise ValueError("`window` is too long.")

            mj = int(mj)
            currMax = np.max(phase[0:mj])
            currMin = np.min(phase[0:mj])
            dev = currMax - currMin
            for winStartIdx in range(1, int(phase.shape[0] - mj)):
                winEndIdx = mj + winStartIdx
                # the running max/min only need recomputing when the sample
                # that just left the window was the current extreme
                if currMax == phase[winStartIdx - 1]:
                    currMax = np.max(phase[winStartIdx:winEndIdx])
                elif currMax < phase[winEndIdx]:
                    currMax = phase[winEndIdx]

                if currMin == phase[winStartIdx - 1]:
                    currMin = np.min(phase[winStartIdx:winEndIdx])
                elif currMin > phase[winEndIdx]:
                    currMin = phase[winEndIdx]

                if dev < currMax - currMin:
                    dev = currMax - currMin

        ncount = phase.shape[0] - mj
        devs[idx] = dev
        deverrs[idx] = dev / np.sqrt(ncount)
        ns[idx] = ncount

    return remove_small_ns(taus_used, devs, deverrs, ns)
```

Thanks for the bug report and the patch.
I guess the problem is that the array exceeds the amount of memory you have, as this seems to be the only limit imposed on numpy ndarrays.

I wondered why we use such a memory-intensive solution in the first place, so I looked into it a little:

mtie_rolling_window docstring reads

Make an ndarray with a rolling window of the last dimension, from http://mail.scipy.org/pipermail/numpy-discussion/2011-January/054401.html
The URL doesn't point anywhere now; I found https://mail.python.org/pipermail/numpy-discussion/2011-January/054401.html but I see no relevance (the title is "aa.astype(int) truncates and doesn't round").

https://mail.python.org/pipermail/numpy-discussion/2011-January/054364.html seems more related ("Rolling window (moving average, moving std, and more)"); maybe the numbering changed between mail archives... But the whole point of that discussion is precisely to enhance performance while saving memory! And it mentions cases where it does allocate large chunks of memory (https://mail.python.org/pipermail/numpy-discussion/2011-January/054370.html)
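For reference, the recipe from that discussion (which, judging from the traceback above, is essentially what mtie_rolling_window is based on) looks roughly like this; `rolling_window` is just an illustrative name here:

```python
import numpy as np

def rolling_window(a, window):
    # Returns an (N - window + 1, window) *view* onto 'a': no data is copied,
    # each "row" simply re-reads the same buffer via a stride trick.
    # Its nominal size is still ~N x window elements, which appears to be
    # what trips the "array is too big" check on a 32-bit numpy build.
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
```

The view itself costs almost no memory, but np.max(rw, axis=1) still has to read N x window elements, so runtime grows with the window size either way.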

So I see several ways forward:

  • insert this patch
  • always use the alternate version (maybe we lose a bit of performance, but I doubt it if the array is allocated anyway...)
  • find out if, following this mailing-list discussion, there is a known way to make numpy's "stride tricks" perform as expected (i.e. not allocate large arrays).
  • use ndimage filters instead (see the sketch after this list)
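To illustrate the last option, here is a minimal sketch using scipy.ndimage (assuming a scipy dependency is acceptable); `mtie_one_tau` is just an illustrative name, not an existing allantools function:

```python
import numpy as np
from scipy.ndimage import maximum_filter1d, minimum_filter1d

def mtie_one_tau(phase, window):
    # Rolling max/min over 'window' consecutive samples, computed in C
    # with O(N) memory, so no (N x window) array is ever materialized.
    win_max = maximum_filter1d(phase, size=window, mode='nearest')
    win_min = minimum_filter1d(phase, size=window, mode='nearest')
    # 'nearest' padding at the edges only repeats the edge sample, which
    # cannot increase a window's max-min range, so the overall maximum
    # equals the MTIE for this window length.
    return np.max(win_max - win_min)
```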

Before I dig any further, maybe somebody remembers why this particular design was chosen?

[edit: better understanding of the 2011 discussion...]

I guess the problem is that the array exceeds the amount of memory you have, as this seems to be the only limit imposed on numpy ndarrays

My estimate is ~5 GB for this array (436365 x 1515 x 8 bytes). Even if it weren't double precision (I don't remember exactly), it would be ~2.5 GB for single precision. I'm using 32-bit Python, so that is enough to hit the memory limit. But Python usually throws a MemoryError in such cases, so my first impression was that there is a size limit for arrays in numpy.
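In numbers (just the back-of-the-envelope arithmetic):

```python
n, window = 436365, 1515
print(n * window * 8 / 2**30)  # ~4.9 GiB for float64
print(n * window * 4 / 2**30)  # ~2.5 GiB for float32
```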

Performance of the inserted code is not very good for small tau (small window sizes); it is slower than the original code. But it performs well for large windows, so in combination with the original code the overall performance is good.

thanks for reporting this and suggesting a fix!

maybe an automated test would also be useful - one where we can see the largest time-series that can be computed, both before and after applying this patch.

I can try to merge the code above during the weekend.

I have applied the fix (above), but now I seem to be unable to reproduce the large memory consumption issue with the old code.
My numpy version is
>>> numpy.__version__
'1.13.3'

@AntonPrygrodskyi can you tell us what numpy version (and OS/Python) you are using?
Everyone: can anyone reproduce this on some platform?

Without a test that actually triggers the new code, it is left without test coverage, which isn't great.
Could this be an issue with numpy.stride_tricks which was already fixed?

What data size did you use as a test? I would bet you need something on a par with your RAM + swap, which is huge on any recent machine...

@AntonPrygrodskyi hinted he was using 32-bit Python (running on Windows, judging by the paths), which might lower the limit at which the bug is triggered.

Anyway, I found it far too lengthy to create and process an appropriately large dataset on a 64-bit OS with a fair amount of RAM (8 GB): I wrote an infinite loop, gradually increasing the size of the sample, and gave up because of execution time (during which the machine becomes less and less responsive...). Maybe one could try on a VM with very little RAM? Not practical for unit testing though... and definitely not a 1-second test.

My suggestion would be to keep the patch (which fixes #4, by the way!...), and maybe add a warning-level log entry, something along the lines of: "Specific processing due to the large size of the data set".
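For example, something like this in the fallback branch (just a sketch, assuming the standard logging module; a warnings.warn call would work as well):

```python
import logging

# inside mtie(), when the fallback path is taken:
logging.warning("mtie: window of %d samples is too large for the "
                "rolling-window method, falling back to the slower loop",
                int(mj + 1))
```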

@AntonPrygrodskyi can you tell us what numpy version (and OS/Python) you are using?

@aewallin I'm using Python 3.6.5 32-bit; Windows 10 64-bit; 8 GB RAM. I'll check which numpy version I'm using and will comment on that separately. If there is a newer version I'll also check how it performs. As I mentioned, my input dataset has a length of 436365 elements. How long were the dataset(s) you tried?

@aewallin My numpy version is 1.14.5, which seems to be newer than yours. So it could be a new bug rather than one that was already fixed. I'm also attaching the data file I was using when I ran into this issue. The sample interval for this data is 33 ms.
FreqTrend_GT0.033s.zip