Better documentation of chopping options
Opened this issue · 5 comments
- ap_features version: 2023.1.3
- Python version: 3.11.8
- Operating System: Ubuntu 22.04.5 LTS
Description
Hi all,
first off, this project is really cool, and I'm really glad that a colleague I met at a conference introduced me to your program! It seems to be able to do everything I need, i.e. analysis of optical mapping data.
I've been trying to get the code to work on my data, however, and it seems unable to process the beats within my optical map recordings. I don't think I understand how the chopping of beats works: it seems to work simply by treating every point where the signal passes through a threshold as the start/end of a beat, but that does not seem to work on my optically mapped action potentials. My entire recording is basically mapped into a single beat, which is definitely not correct. I have not tried correcting the baseline of my recording, and there is actually a significant drift in the normalised signal; is this something I should do when I initialise my Beats? There are unfortunately no mentions of your baseline correction methods in the tutorial, but they seem to be pretty well documented in the package.
Would you know what could be the issue? I could provide a numpy file with my signals if it would make troubleshooting easier.
Thanks in advance!
Hi @tk231
Really happy to hear that you find this package useful, and thanks for opening this issue. Feedback from users is very valuable in order to make a software that works in different scenarios that I might not have tested myself.
When it comes to how the beats are chopped, I think the best resource is the API documentation.
Here for chopping data with pacing info (i.e. you use the pacing protocol to determine when to chop):
https://computationalphysiology.github.io/ap_features/docs/api.html#ap_features.chopping.chop_data_with_pacing
or the more sophisticated method without this information about pacing (which is probably the one you are referring to):
https://computationalphysiology.github.io/ap_features/docs/api.html#ap_features.chopping.chop_data_without_pacing
One thing that might mess this up is the `max_window` and `min_window` arguments, so if you know the expected duration of a beat in advance, you might want to adjust these parameters.
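Roughly speaking, chopping without pacing looks for points where the normalised signal crosses a threshold on the upstroke and uses those as beat boundaries. Here is a simplified NumPy sketch of that idea on synthetic data (a toy illustration only, not the library's actual implementation):

```python
import numpy as np

# Synthetic trace: three "beats" as Gaussian bumps on a flat baseline.
t = np.linspace(0, 3, 3000)
y = sum(np.exp(-((t - c) ** 2) / (2 * 0.05**2)) for c in (0.5, 1.5, 2.5))

# Normalise, then find rising crossings of a threshold.
y_norm = (y - y.min()) / (y.max() - y.min())
above = y_norm > 0.5
upstrokes = np.flatnonzero(~above[:-1] & above[1:])

# Chop between consecutive upstrokes; the last beat runs to the end.
ends = list(upstrokes[1:]) + [len(y)]
beats = [y[s:e] for s, e in zip(upstrokes, ends)]
```

With a `min_window`/`max_window` constraint on top of this, candidate beats that are too short or too long would be rejected, which is why those parameters matter when the defaults do not match your beat duration.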
However, there might be cases that I haven't thought about, so feel free to drop some sample data here, and then I can take a look.
Hi @finsberg
Thank you for the prompt reply, and for your direction to the API documentation!
I have had a look at the more sophisticated method without pacing information (IMO this suits my use case better), although I had missed the documentation of the function with pacing info. I think I know the problem now: the program looks for instances "where the pacing amplitude goes from zero to something positive".
In my optical mapping data, our intensities are always positive, which is why I would definitely need to perform some kind of baseline correction for this to work. I'll try the baseline correction algorithm in `ap_features`, but I have in the past used pybaselines to pretty good effect, and will report back.
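For readers following along, the effect of baseline (drift) correction can be illustrated with a plain polynomial fit on synthetic data; this is only a stand-in sketch, not the algorithm that `ap_features` or pybaselines actually uses:

```python
import numpy as np

# Synthetic example: a periodic signal riding on a slow linear drift.
t = np.linspace(0, 10, 1000)
y = np.sin(2 * np.pi * t) + 0.3 * t

# Fit a low-order polynomial as an estimate of the baseline and subtract it.
baseline = np.polyval(np.polyfit(t, y, deg=1), t)
corrected = y - baseline
```

After subtraction the drift is gone and the trace oscillates around zero, which is the kind of shape a threshold-based chopper expects.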
Here's a sample of my optical map recording, just for your amusement, if you're interested in trying it out. We had a sampling rate of 478 Hz, if that information helps.
P.S.: I have not seen a publication for this library; is there a way to cite `ap_features` if I do use it in my work? Or should I just refer to this repo?
Thanks for sharing the dataset. I tried to chop the dataset into beats, and it seems to work if you select a low value of `min_window`; for example, the following code
```python
import numpy as np
import matplotlib.pyplot as plt
import ap_features as apf

data = np.load("Basler_acA720-520um__40190569__20230919_114100243_vol_denoised.npy")
t = np.arange(0, len(data))

y = apf.Beats(
    y=data,
    t=t,
    background_correction_method="full",
    chopping_options={"min_window": 5},
)
beats = y.beats

fig, ax = plt.subplots()
for beat in beats[:20]:
    ax.plot(beat.t, beat.y)
fig.savefig("beats.png")
```
produces the following plot
Not sure if this is what you expect?
Also, thanks for mentioning pybaselines. This sounds like a great package that I probably should integrate into `ap_features`. I opened a new issue about that here: #132
For citation I have now added a citation file which you can use for now.
Hi @finsberg,
thanks for getting back!
I had not seen any mention of `min_window` in the tutorials, only in the code documentation. Would you happen to have an explanation of why it works with a small value?
The data I sent was the raw data that one would expect to obtain from optical maps. Usually, the processing of the data would entail:
- Inverting the signal
- Filtering
- Performing some sort of normalisation (for my case, we do pixelwise min-max normalisation, or for an ROI, this would be averaged over the ROI's x and y axes)
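The three steps above can be sketched like this (toy data and a moving-average filter stand in for the real recording and a proper filter):

```python
import numpy as np

# Toy stand-in for an optical-mapping trace (real data comes from the camera).
rng = np.random.default_rng(0)
raw = 1000.0 - 200.0 * np.abs(np.sin(np.linspace(0, 10, 2000))) + rng.normal(0, 5, 2000)

# 1. Invert: fluorescence dips become upward deflections.
inverted = raw.max() - raw

# 2. Filter: a simple moving average as a placeholder for a real filter.
kernel = np.ones(11) / 11
filtered = np.convolve(inverted, kernel, mode="same")

# 3. Min-max normalise to [0, 1] (per pixel, or averaged over an ROI).
normalised = (filtered - filtered.min()) / (filtered.max() - filtered.min())
```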
From that plot, I notice two things: (1) the signals have not been inverted, and (2) the splitting of the signals seems to be based on the point of upstroke (downstroke if the signal were inverted). I have taken your code and modified it. This is what it looks like:
```python
import numpy
import matplotlib.pyplot as plt
import ap_features as apf

data = numpy.load("Basler_acA720-520um__40190569__20230919_114100243_vol_denoised.npy")

# Recording params
framerate = 478  # Unit: frames per second
rec_duration = len(data) / framerate  # Unit: seconds
sampling_rate = 1 / framerate
time = numpy.linspace(start=0, stop=rec_duration, num=len(data))

# Invert data
signal_max = data.max()
inverted_data = -1 * (data - signal_max)

# Analyse with ap_features
y = apf.Beats(
    y=inverted_data,
    t=time,
    background_correction_method="full",
    chopping_options={
        # "extend_front": 100,
        # "extend_end": 700,
        "min_window": sampling_rate / 200,
    },
)
beats = y.beats

fig, ax = plt.subplots()
for beat in beats:
    ax.plot(beat.t, beat.y)
ax.set_xlabel("Time (s)")
ax.set_ylabel("Intensity (a.u.)")
plt.show()
```
This is the resulting plot (cropped for the first 5 secs):
It looks like the first few AP beats were not correctly cropped, but the others looked okay, with some looking pretty good while others have some kind of overlap. Would you happen to have an explanation for that?
P.S.: Thanks for the citation file! I will definitely use it whenever I use `ap_features` in any published work!
OK, I think one of the main issues here (which should have been better documented) is that the values of `min_window`, `max_window`, `extend_front`, and `extend_end` should be in the same units as the time axis of your data. So in your case, the following options seem to give pretty good results:
```python
y = apf.Beats(
    y=inverted_data,
    t=time,
    background_correction_method="full",
    chopping_options={
        "extend_front": 0.05,
        "min_window": 0.05,
        "max_window": 0.2,
    },
)
```
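Since these options share the units of the time axis, converting a window expressed in frames to seconds is a single division. A small illustrative helper (the 478 Hz framerate is taken from the recording described above; the helper name is my own):

```python
framerate = 478  # frames per second

def frames_to_seconds(n_frames: int) -> float:
    """Convert a window length in frames to seconds, the unit of `time`."""
    return n_frames / framerate

chopping_options = {
    "min_window": frames_to_seconds(24),  # about 0.05 s
    "max_window": frames_to_seconds(96),  # about 0.2 s
}
```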
I renamed this issue to indicate that the problem is really with documenting the chopping options.