Cells failing QC due to "electrode_0_pa missing value"

Question

Cells failing QC due to "electrode_0_pa missing value"

Opened this issue 4 years ago · 13 comments

Describe the use case that is addressed by this feature.
There are currently nearly 700 IVSCC pipeline experiments that are failing cell QC with the fail tag "electrode_0_pa missing value". This fail tag is referring to one of our initial voltage clamp QC sweeps that IPFX is unable to find (stimulus name: ‘EXTPINBATH’) because the incorrect sweep was being ran at this point (stimulus name: ‘EXTPSMOKET’). This is actually not that big of a deal because these two stimulus waves are actually the same wave, but with different names to make it easy to distinguish what point we are at during the experiment simply by stimulus name, which is how IPFX identifies these sweeps.

Describe the solution you'd like
If the ‘EXTPINBATH’ sweep is not found then it needs to look for the ‘EXTPSMOKET’ sweeps and do a quick resistance calculation to ensure that the rig user actually had the pipette in the bath and then measure the baseline electrode current and compare it to the qc values to ensure that it is in range. This can be accomplished by adding a new function to qc_feature_extractor.py that will look for the last 'EXTPSMOKET' sweep before the 'EXTPCllATT' sweep and if one exists, then it should check the resistance by calling 'measure_seal' from qc_features.py, and if the seal is less than 10 MOhms then call 'measure_electrode_0' from 'qc_features.py'. This new function can be called by 'extract_electrode_0' in the try, except block when the 'EXTPINBATH' sweep is not found.

Describe alternatives you've considered
The only other alternative solution would be to manually QC all of these cells, and we would like to avoid that if possible.

Additional context
This is a high priority issue because this issue is affecting several hundred IVSCC pipeline experiments, all of which are failing QC due to the issue

Do you want to work on this issue?
I am already working on a fix for this issue and will submit a pull request as soon as I am done testing.

Answer 1 · 2021-04-20T00:19:43.000Z

Hi @ru57y34nn I think the Marmot team can fix this if you can help us figure out what is the set of affected data. Can you please provide a list of experiments, or another way to identify the data that has this issue? Thank you

Answer 2 · 2021-04-21T19:32:41.000Z

Hi @wbwakeman Yeah I can get you a list of all of the affected data, but I am curious what you and/or the Marmot team have in mind for a fix as we would like to avoid altering or duplicating the data files. Also, I have already submitted a pull request with a fix (Pull request #510, which addresses this issue) and I just wanted to make sure that you have already looked at that. Sergey had raised a couple issues in the comments of the pull request but I believe I addressed them and have been waiting on a reply. It sounds like Sergey is not completely opposed to merging the pull request but has stated that we would perhaps have to document these changes to IPFX since they deviate somewhat from the white paper, in which case I think we could do that.

Answer 3 · 2021-04-21T20:49:58.000Z

Hi @ru57y34nn The team and I have had a few conversations about this. We do not want to merge the PR because it introduces an additional code path to address a special case of the data that should not be encountered again. Some of our worst code and most painful bugs have been the result of making special-case modifications to the code base to account for issues like this. We are trying to learn from our past mistakes.

Our plan is the retain a copy of the original data files on the file system, but to create a new file with the data updated to be correct, and then process the updated file through the rest of the pipeline.

Answer 4 · 2021-04-22T17:02:13.000Z

Okay thanks @wbwakeman, I see where you are coming from and that should be fine for a solution. I will get you the list of affected data but I was thinking that it might be helpful if I also included the sweep number of the EXTPSMOKET sweep that should be relabeled as EXTPINBATH for each experiment. Let me know if that would be helpful and I will include that in the list as well, thanks.

Answer 5 · 2021-04-22T18:47:59.000Z

Thanks @ru57y34nn That would be very helpful!

Answer 6 · 2021-05-07T02:46:57.000Z

Write scrip to relabel EXTPSMOKET -> EXTPINBATH for a given sweep
Apply script to a experiments missing_e0_data.csv
Re-run experiments through the pipeline (possibly Wayne)
Circle back with Rusty about the results

Answer 7 · 2021-08-03T15:42:59.000Z

@ru57y34nn We changed and ran one of these as an example. Can you please check this file and see if it looks okay?

/allen/programs/celltypes/production/mousecelltypes/prod2826/Ephys_Roi_Result_1048215014/1048215014_1119548823_spikes.nwb

Answer 8 · 2021-08-03T17:48:54.000Z

@wbwakeman I checked the new nwb file and it now shows sweep 1 as the EXTPINBATH180424, where as this sweep was previously labeled as EXTPSMOKET180424. This definitely fixed the issue for this cell and I see that it has completed QC and feature extraction and is no longer failing due to "electrode_0_pa missing value". Looks good, thanks and let me know if you need anything else.

Answer 9 · 2021-08-03T17:55:12.000Z

Thanks @ru57y34nn . I will update and reprocess the remaining 500+ experiments in the list.

Thanks @JakeSAI for converting the files

Answer 10 · 2021-08-03T19:05:31.000Z

@ru57y34nn
Looks like most of the 579 have completed processing. One is still running. 29 failed because stimulus_duration was null

Answer 11 · 2021-08-04T18:38:45.000Z

Thanks @wbwakeman I will check these today. Do you happen to have the list of 29 that failed due to stimulus_duration was null? I'm sure I can track them down, I just figured if you had the list handy that would be easiest. Thanks

Answer 12 · 2021-08-04T18:50:37.000Z

@ru57y34nn

Answer 13 · 2021-08-05T18:52:58.000Z

@wbwakeman I have been going through and checking these and with the exception of the 29 that failed due to the stimulus_duration was null issue issue, everything else looks great. I am still looking into the null stimulus_duration issue, but it seems very familiar to another issue we had back at the beginning of the year with the stimulus_amplitude being null. I noticed that there are other experiments that contain sweeps with null stimulus_durations that are passing just fine; it seems that the null stimulus_duration is not a problem as long as the sweep is tagged in the STIMULUS_SUMMARY_V3_QUEUE json (usually with "recording stopped before completing the experiment epoch") and then removed before sweep QC. I have attached two images, one where the stimulus_duration is null but the sweep is not tagged, which causes the issue, and one where the stimulus_duration is null but the sweep is tagged, which does not cause the issue. I am still looking into why these sweeps did not get tagged even though it seems like they should have. I'll update you again one I have more information, thanks.