AllenInstitute/ipfx

Cells failing QC due to "electrode_0_pa missing value"

Opened this issue · 13 comments

Describe the use case that is addressed by this feature.
There are currently nearly 700 IVSCC pipeline experiments that are failing cell QC with the fail tag "electrode_0_pa missing value". This fail tag is referring to one of our initial voltage clamp QC sweeps that IPFX is unable to find (stimulus name: ‘EXTPINBATH’) because the incorrect sweep was being ran at this point (stimulus name: ‘EXTPSMOKET’). This is actually not that big of a deal because these two stimulus waves are actually the same wave, but with different names to make it easy to distinguish what point we are at during the experiment simply by stimulus name, which is how IPFX identifies these sweeps.

Describe the solution you'd like
If the ‘EXTPINBATH’ sweep is not found then it needs to look for the ‘EXTPSMOKET’ sweeps and do a quick resistance calculation to ensure that the rig user actually had the pipette in the bath and then measure the baseline electrode current and compare it to the qc values to ensure that it is in range. This can be accomplished by adding a new function to qc_feature_extractor.py that will look for the last 'EXTPSMOKET' sweep before the 'EXTPCllATT' sweep and if one exists, then it should check the resistance by calling 'measure_seal' from qc_features.py, and if the seal is less than 10 MOhms then call 'measure_electrode_0' from 'qc_features.py'. This new function can be called by 'extract_electrode_0' in the try, except block when the 'EXTPINBATH' sweep is not found.

Describe alternatives you've considered
The only other alternative solution would be to manually QC all of these cells, and we would like to avoid that if possible.

Additional context
This is a high priority issue because this issue is affecting several hundred IVSCC pipeline experiments, all of which are failing QC due to the issue

Do you want to work on this issue?
I am already working on a fix for this issue and will submit a pull request as soon as I am done testing.

Hi @ru57y34nn I think the Marmot team can fix this if you can help us figure out what is the set of affected data. Can you please provide a list of experiments, or another way to identify the data that has this issue? Thank you

Hi @wbwakeman Yeah I can get you a list of all of the affected data, but I am curious what you and/or the Marmot team have in mind for a fix as we would like to avoid altering or duplicating the data files. Also, I have already submitted a pull request with a fix (Pull request #510, which addresses this issue) and I just wanted to make sure that you have already looked at that. Sergey had raised a couple issues in the comments of the pull request but I believe I addressed them and have been waiting on a reply. It sounds like Sergey is not completely opposed to merging the pull request but has stated that we would perhaps have to document these changes to IPFX since they deviate somewhat from the white paper, in which case I think we could do that.

Hi @ru57y34nn The team and I have had a few conversations about this. We do not want to merge the PR because it introduces an additional code path to address a special case of the data that should not be encountered again. Some of our worst code and most painful bugs have been the result of making special-case modifications to the code base to account for issues like this. We are trying to learn from our past mistakes.

Our plan is the retain a copy of the original data files on the file system, but to create a new file with the data updated to be correct, and then process the updated file through the rest of the pipeline.

Okay thanks @wbwakeman, I see where you are coming from and that should be fine for a solution. I will get you the list of affected data but I was thinking that it might be helpful if I also included the sweep number of the EXTPSMOKET sweep that should be relabeled as EXTPINBATH for each experiment. Let me know if that would be helpful and I will include that in the list as well, thanks.

Thanks @ru57y34nn That would be very helpful!

  • Write scrip to relabel EXTPSMOKET -> EXTPINBATH for a given sweep
  • Apply script to a experiments missing_e0_data.csv
  • Re-run experiments through the pipeline (possibly Wayne)
  • Circle back with Rusty about the results

@ru57y34nn We changed and ran one of these as an example. Can you please check this file and see if it looks okay?

/allen/programs/celltypes/production/mousecelltypes/prod2826/Ephys_Roi_Result_1048215014/1048215014_1119548823_spikes.nwb

@wbwakeman I checked the new nwb file and it now shows sweep 1 as the EXTPINBATH180424, where as this sweep was previously labeled as EXTPSMOKET180424. This definitely fixed the issue for this cell and I see that it has completed QC and feature extraction and is no longer failing due to "electrode_0_pa missing value". Looks good, thanks and let me know if you need anything else.

Thanks @ru57y34nn . I will update and reprocess the remaining 500+ experiments in the list.

Thanks @JakeSAI for converting the files

@ru57y34nn
Looks like most of the 579 have completed processing. One is still running. 29 failed because stimulus_duration was null

Thanks @wbwakeman I will check these today. Do you happen to have the list of 29 that failed due to stimulus_duration was null? I'm sure I can track them down, I just figured if you had the list handy that would be easiest. Thanks

@ru57y34nn

1051143316
1052510155
1055221318
1058212878
1059527503
1062296205
1062574935
1062768364
1062774362
1069242291
1069527000
1071658319
1071675546
1075277114
1075901296
1076113329
1076279677
1077698184
1077731028
1079007610
1079017033
1079070872
1079081376
1079091494
1079571190
1080300049
1080315249
1080372178
1085140564

@wbwakeman I have been going through and checking these and with the exception of the 29 that failed due to the stimulus_duration was null issue issue, everything else looks great. I am still looking into the null stimulus_duration issue, but it seems very familiar to another issue we had back at the beginning of the year with the stimulus_amplitude being null. I noticed that there are other experiments that contain sweeps with null stimulus_durations that are passing just fine; it seems that the null stimulus_duration is not a problem as long as the sweep is tagged in the STIMULUS_SUMMARY_V3_QUEUE json (usually with "recording stopped before completing the experiment epoch") and then removed before sweep QC. I have attached two images, one where the stimulus_duration is null but the sweep is not tagged, which causes the issue, and one where the stimulus_duration is null but the sweep is tagged, which does not cause the issue. I am still looking into why these sweeps did not get tagged even though it seems like they should have. I'll update you again one I have more information, thanks.
stimulus_null_fail
stimulus_null_okay