explore support for variable-length sweeps
swharden opened this issue · 3 comments
Currently pyABF is hard-coded to expect fixed-length sweeps. Andrew emailed me some ABFs with variable-length sweeps, which pyABF has trouble with. ABF info: 2020_06_16_0000.md and 2020_06_16_0001.md
pyABF calculates sweep length like this, which is correct for fixed-length sweeps, but incorrect for variable length sweeps:
Lines 344 to 345 in fcf0c7d
Variable-length sweeps are managed by the SynchArraySection, which pyABF finds the memory address for but doesn't use.
Line 454 in fcf0c7d
For 2020_06_16_0000.md the header map says SynchArraySection = [362, 8, 3].
- 362 times 512 = 185,344, which is a memory address near the end of the file
- 8 means each synch array contains 8 bytes
- 3 means there are 3 synch arrays
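The arithmetic above can be sketched in a few lines. This is only an illustration of the calculation, not pyABF code: `section_byte_range` and `ABF_BLOCK_SIZE` are hypothetical names, and the 512-byte block size is taken from the multiplication above.

```python
# Hypothetical helper: turn a header-map triple into a byte range.
# The triple is [block index, bytes per entry, entry count], as read above.
ABF_BLOCK_SIZE = 512  # bytes per block, taken from the "362 times 512" step


def section_byte_range(block_index, bytes_per_entry, entry_count):
    """Return (first_byte, byte_count) for a section in the file."""
    first_byte = block_index * ABF_BLOCK_SIZE
    byte_count = bytes_per_entry * entry_count
    return first_byte, byte_count


# SynchArraySection = [362, 8, 3] from 2020_06_16_0000
first_byte, byte_count = section_byte_range(362, 8, 3)
print(first_byte, byte_count)
```

So the whole synch array should occupy only 24 bytes starting at byte 185,344.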
Let's poke around with a hex editor at this address and see what we have
I'm guessing each array contains 2 values, each a UInt32. This would make 3 arrays:
- 14479, 3540
- 44979, 70040
- 147479, 16040
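The same guess can be checked without a hex editor. The sketch below is an assumption-laden illustration: the 24 bytes are rebuilt from the values listed above rather than read from the real file, little-endian byte order is assumed, and `unpack_synch_entries` is a hypothetical name.

```python
import struct

# Stand-in for the 24 bytes at the synch array address, rebuilt from the
# six values listed above (assumption: little-endian unsigned 32-bit ints).
raw = struct.pack("<6I", 14479, 3540, 44979, 70040, 147479, 16040)


def unpack_synch_entries(raw_bytes, entry_size=8):
    """Split raw bytes into (first, second) UInt32 pairs, one per 8-byte entry."""
    entries = []
    for offset in range(0, len(raw_bytes), entry_size):
        first, second = struct.unpack_from("<II", raw_bytes, offset)
        entries.append((first, second))
    return entries


entries = unpack_synch_entries(raw)
for first, second in entries:
    print(first, second)
```

With a real file, `raw` would instead come from seeking to the section's byte address and reading entry size times entry count bytes.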
Let's compare to ClampFit
Those values look like start times and durations:
- sweep 1: starts at 1.4479 s (lasts 0.3540 s)
- sweep 2: starts at 4.4979 s (lasts 7.0040 s)
- sweep 3: starts at 14.7479 s (lasts 1.6040 s)
EDIT: it seems sweep 2 start time doesn't match up. What the heck is this?
... not sure how to best modify pyABF to support this yet (fixed sweep lengths are baked in pretty deep) but at least now we know how to get all the information we need
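The start times and durations listed above can be reproduced with a simple division. The scale of 10,000 units per second is an assumption inferred from those numbers (14479 becomes 1.4479 s), and using one scale for both columns is also an assumption. `describe_sweeps` is a hypothetical name.

```python
# Assumption: both values in each pair are in units of 1/10,000 s,
# inferred from the numbers above (14479 -> 1.4479 s, 3540 -> 0.3540 s).
UNITS_PER_SECOND = 10_000

entries = [(14479, 3540), (44979, 70040), (147479, 16040)]


def describe_sweeps(entries, units_per_second=UNITS_PER_SECOND):
    """Convert raw (start, length) pairs into (start_sec, duration_sec) pairs."""
    return [
        (start / units_per_second, length / units_per_second)
        for start, length in entries
    ]


for number, (start_sec, duration_sec) in enumerate(describe_sweeps(entries), start=1):
    print(f"sweep {number}: starts at {start_sec:.4f} s (lasts {duration_sec:.4f} s)")
```

This only reproduces the reading of the raw values. It does not explain the sweep 2 start time mismatch noted in the EDIT.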
Awesome, this information dates back to ABF1 files so we have some documentation:
https://swharden.com/pyabf/abf1-file-format.md.html#the-abf-synch-section
The ABF Synch Section
The ABF Synch array is an important array that stores the start time and length
of each portion of the data if the data are not part of a continuous gap-free
acquisition. The data section might contain equal length or variable length
sweeps of data. The Synch Array contains a record to indicate the start time
and length of every sweep or Event in the data file. The ABF reading routines
automatically decode the Synch Array when providing information about the data.
A Synch array is created and used in the following acquisition modes:
ABF_VARLENEVENTS, ABF_FIXLENEVENTS & ABF_HIGHSPEEDOSC. The acquisition modes
ABF_GAPFREEFILE and ABF_WAVEFORMFILE do not always use a Synch array.
Offset | Header Entry Name | Type | Description |
---|---|---|---|
0 | lStart | long | Start time of sweep in fSynchTimeUnit units. |
4 | lLength | long | Length of the sweep in multiplexed samples. |
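The record layout in the table maps naturally onto a NumPy structured dtype. This is a rough sketch rather than pyABF's reader: it assumes `long` means a little-endian signed 32-bit integer, `read_synch_array` is a hypothetical helper, and the demo reads a small dummy file rather than a real ABF.

```python
import os
import tempfile

import numpy as np

# Record layout from the table: lStart at offset 0, lLength at offset 4.
# Assumption: each "long" is a little-endian signed 32-bit integer.
SYNCH_DTYPE = np.dtype([("lStart", "<i4"), ("lLength", "<i4")])


def read_synch_array(path, first_byte, entry_count):
    """Read entry_count synch records starting at first_byte."""
    with open(path, "rb") as f:
        f.seek(first_byte)
        return np.fromfile(f, dtype=SYNCH_DTYPE, count=entry_count)


# Demo on a dummy file: 16 padding bytes followed by three records.
records = np.array(
    [(14479, 3540), (44979, 70040), (147479, 16040)], dtype=SYNCH_DTYPE
)
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(b"\x00" * 16)
    tmp.write(records.tobytes())
    demo_path = tmp.name

synch = read_synch_array(demo_path, first_byte=16, entry_count=3)
os.remove(demo_path)
print(synch["lStart"], synch["lLength"])
```

Named fields keep the two columns from being confused later, since lStart is in time units and lLength is in samples.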
I got this working in concept...
# THIS EXAMPLE ASSUMES A SINGLE CHANNEL
import matplotlib.pyplot as plt
import numpy as np
import pyabf

# DATA_FOLDER is the folder containing the ABF file (defined elsewhere)
filePath = DATA_FOLDER + "/2020_06_16_0000.abf"
abf = pyabf.ABF(filePath)
sweepYs = []
sweepXs = []
with open(filePath, 'rb') as fb:
    # sweeps are stored back to back, so seek once and read sequentially
    fb.seek(abf.dataByteStart)
    for sweepIndex in abf.sweepList:
        firstPoint = abf._syncArraySection.lStart[sweepIndex]
        pointCount = abf._syncArraySection.lLength[sweepIndex]
        # read this sweep's raw values, then scale them
        sweepY = np.fromfile(fb, dtype=abf._dtype, count=pointCount)
        sweepY = np.multiply(sweepY, abf._dataGain)
        sweepY = np.add(sweepY, abf._dataOffset)
        sweepYs.append(sweepY)
        # shift the time axis so each sweep sits at its start time
        offsetSec = firstPoint / abf.dataRate
        sweepX = np.arange(len(sweepY)) / abf.dataRate + offsetSec
        sweepXs.append(sweepX)
plt.figure()
for i in abf.sweepList:
    plt.plot(sweepXs[i], sweepYs[i])
plt.show()
Implementing this in the core pyABF library will require extreme care to ensure the existing behavior for fixed-length sweeps remains unmodified. The ABF reading function is definitely one I do not want to break 😅 I'm happy I'm protected by thousands of automated tests, but still, to avoid headaches I'll only move on this when I'm ready to approach this very carefully.
Got it all figured out, merged in, and it's now live on PyPI (pyabf 2.2.6)
pip install --upgrade pyabf
The sweepY list may now be a variable size, and if absoluteTime is True then sweepX returns proper times in the recording. With variable-length recordings this means gaps in the data may be present:
import pyabf
import matplotlib.pyplot as plt
abf = pyabf.ABF("2020_06_16_0000.abf")
for sweepIndex in abf.sweepList:
    abf.setSweep(sweepIndex, absoluteTime=True)
    plt.plot(abf.sweepX, abf.sweepY)
plt.show()
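To get a feel for how large those gaps are, they can be estimated from sweep start times and durations alone. The sketch below does not use pyABF. It takes the start times and durations quoted earlier in this thread at face value (sweep 2's start time was flagged as questionable), and `gaps_between_sweeps` is a hypothetical name.

```python
# Start times and durations (seconds) as quoted earlier in this thread.
# Assumption: these are taken at face value for illustration only.
starts_sec = [1.4479, 4.4979, 14.7479]
durations_sec = [0.3540, 7.0040, 1.6040]


def gaps_between_sweeps(starts, durations):
    """Return the silent time between the end of each sweep and the next start."""
    gaps = []
    for i in range(len(starts) - 1):
        sweep_end = starts[i] + durations[i]
        gaps.append(starts[i + 1] - sweep_end)
    return gaps


for i, gap in enumerate(gaps_between_sweeps(starts_sec, durations_sec), start=1):
    print(f"gap after sweep {i}: {gap:.4f} s")
```

A negative gap would mean two sweeps overlap, which would be a sign that the start times or lengths were misread.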