AllenInstitute/ipfx

Correctly infer stimulus/experiment epochs from traces with Nan values

Closed this issue · 30 comments

Currently the "Recording stopped before completing the experiment epoch" tag is preventing sweeps from being added to the "sweep_states" list in EPHYS_QC_V3_QUEUE_output.json. Need to update sweep_states with the tag.

Touch base with Rusty about his expectations for these tags and qc calls at the level of 'sweep' and 'cell'.

Example: /allen/programs/celltypes/production/mousecelltypes/prod3052/Ephys_Roi_Result_1086833106/202103081352_EPHYS_QC_V3_QUEUE_1086833106_output.json shows that the cell did not fail ephys qc (“failed_qc”: false), and all of the sweeps are now in LIMS and they are all “auto passed”, but in the ephys qc json there is only a single sweep that shows up in the “sweep_states” list:

  "sweep_states": [
    {
      "sweep_number": 54,
      "passed": true,
      "reasons": []
    }
  ]

Looking closer at /allen/programs/celltypes/production/mousecelltypes/prod3052/Ephys_Roi_Result_1086833106/202102261501_EPHYS_NWB_STIMULUS_SUMMARY_V3_QUEUE_1086833106_output.json, sweep 54 is the only sweep without the tag “Recording stopped before completing the experiment epoch”. This tag should not prevent sweeps from passing ephys QC and then feature extraction, since most sweeps will have this tag because they auto-terminate the sweeps after the stimulus epoch and once the cell returns to baseline in order to save time during experiments.

Rusty checked some other experiments and confirms that only sweeps without “tags” in the EPHYS_NWB _STIMULUS_SUMMARY_V3_QUEUE_output.json are making it into the “sweep_states” list in the EPHYS_QC_V3_QUEUE_output.json.

Would like some clarity as to how all of the sweeps are showing up in LIMS as “auto passed” but they are omitted from the “sweep_states” of the EPHYS_QC_V3_QUEUE_ouput.json because of tags that are not necessarily indicating an issue.

Just a thought here that for “Recording stopped before completing the experiment epoch” in particular, I'm not sure just ignoring the tag is the answer. The QC checks currently assume the full epoch is present to calculate delta_vm. We may need to either drop that check (since it's essentially duplicating a MIES QC step), or tweak it to be more compatible with the MIES approach (maybe just need to adjust the time interval, POSTSTIM_STABILITY_EPOCH)

I have also noticed that if all of the sweeps in an experiment are tagged in the EPHYS_NWB_STIMULUS_SUMMARY_V3_QUEUE_output.json with either the tag "Recording stopped before completing the experiment epoch" or "Stimulus contains NaN values" then no sweeps will make it into the EPHYS_QC_V3_QUEUE_input.json and the cell will fail qc in the EPHYS_QC_V3_QUEUE_output.json with the fail tags "No sweep states available"; however, if at least one sweep is not tagged then that sweep, and any other untagged sweeps will make it into the EPHYS_QC_V3_QUEUE_input.json and the cell will move on to sweep qc.

If all sweeps that made it into the EPHYS_QC_V3_QUEUE_input.json fail sweep qc then the cell will fail qc in the EPHYS_QC_V3_QUEUE_ouput.json with the fail tags "No current clamps sweeps passed QC"; however, if at least one sweep passed sweep qc then the cell will pass sweep qc and move onto feature extraction.

It appears that feature extraction will only exclude cells that have explicitly failed sweep qc in the EPHYS_QC_V3_QUEUE_output.json ("passed": false), so any sweeps that did not make it into sweep qc because of either of the tags "Recording stopped before completing the experiment epoch" or "Stimulus contains NaN values", then those sweeps would not have been failed during sweep qc and so they will be used in feature extraction. It appears that sweeps that were omitted from sweep qc due to these tags are then used in feature extraction, even though they have not gone through sweep qc.

See this experiment as an example where only sweeps 51 and 54 were not tagged with "Recording stopped before completing the experiment epoch" or "Stimulus contains NaN values".

  • /allen/programs/celltypes/production/mousecelltypes/prod3091/Ephys_Roi_Result_1093425613/

Both sweeps passed sweep qc in the EPHYS_QC_V3_QUEUE_ouput.json so the cell moved to feature extraction and all sweeps were then used in feature extraction, even though most sweeps did not make it into sweep qc (only sweeps 18 and 54 made it to sweep qc and passed).

We need to find a resolution to this issue soon because we have current IVSCC pipeline experiments failing sweep qc because all of the sweeps were tagged, and we have features being extracted from sweeps that have not gone through sweep qc. I can start digging into this issue but it would be nice to have some input from the developers about this as well.

I think I have figured out the issue with the "Recording stopped before completing the experiment epoch" tag. The tag will be applied by the check_sweep_integrity function in qc_feature_extractor.py if the ending index of the recording epoch is less than the ending index of the experiment epoch. Many of our recent experiments have most or all of their sweeps tagged with this because the ending index of the recording epoch is less than the ending index of the experiment epoch.

The end of the experiment epoch is defined as stim_end_index + POSTSTIM_STABILITY_EPOCH * sampling_rate, and the issue is stemming from a bug in how the stim_end_index is determined.

epochs.py contains the get_stim_epoch function which determines the start and end indices of the stimulus epoch. The issue is that it does this by using the stimulus trace and getting the differences between the adjacent points in the array with numpy.diff and then looking for indices of nonzero differences with numpy.flatnonzero.

di = np.diff(i)
di_idx = np.flatnonzero(di) # != 0

but there are NaN values present at the end of the trace that are not dropped by the numpy.flatnonzero function, which is causing the indices of all of those NaN points to be included in the di_idx array, which is causing the stim_end_index to be the same as the ending index of the entire sweep.

epochs_example

As can be seen in the above image, the ending index of the stimulus epoch is basically the same as the ending index of the entire sweep, which at a sampling rate of 200 KHz puts the end of the stimulus at 3.6 seconds, which is clearly not the case. As well, this also puts the end of the experiment epoch at 4.1 seconds, which is also clearly not the case here. I was able to correct this issue by first dropping the NaN values from the end of the array before calling numpy.flatnonzero:

di = np.diff(i)
di = di[~np.isnan(di)] # != NaN
di_idx = np.flatnonzero(di) # != 0

After adding this line of code I am now able to get the correct stim_end_index and the correct experiment_end_index. This should largely fix the issue of sweeps being incorrectly tagged with "Recording stopped before completing the experiment epoch" and from being omitted from sweep qc.

epochs_example_fixed

Now the end of the stimulus epoch is at 1.6 seconds and the recording epoch no longer ends before the experiment epoch

I will submit a pull request with this fix for the issue.

@ru57y34nn nice work identifying the issue and finding a fix! That seems fine to me as an initial patch for the issue, but I also wanted to suggest that we more systematically clarify the IPFX policy on missing data in sweeps in light of this.
This sort of issue is exactly what I was worried about in discussion with @MatthewAitken regarding #503 - that there might be usable sweeps terminated only slightly early with partial NaN fill, which would be discarded by the overall NaN check there (or cause other weird behavior). We missed it then because the examples we looked at were all or almost all NaN, not usable.
I'd suggest we take one of two policies:

  1. Sweeps contain only actually recorded data, with missing data indicated by nan (stim or response) or zero (response) truncated when loading the sweep. This is currently done for zeros but not nan (
    recording_end_idx = nonzero[-1] + 1
    ) and could be easily extended. This also makes the 'recording' epoch kind of redundant, so that might want to be revisited.
  2. Sweeps contain all data in the file, with the actual recording identified by the epoch, and analysis ignoring missing data where needed. In this case I'd suggest we reverse the current behavior for zeros for consistency (introduced in #384), and perhaps standardize the representation of missing data on load (to all NaN probably, across both stim and response).

@tmchartrand Thanks for the reply and the additional information here. I went ahead and pushed up another commit to handle the 'Stimulus contains NaN values' tag for now by just looking for NaN values during the stimulus epoch rather than across the entire sweep. This should fix the issue with so many of our sweeps being tagged and removed just because they contain some NaN values at the end of the stimulus sweep, but it will still tag sweeps that contain NaN values during the stimulus epoch of the stimulus trace.
As far as the policy options you mentioned, I am in favor of #1 and just dropping NaN values from sweeps when loading them.
NaN values in the stimulus trace can indicate one of two things:

  1. That the recording was stopped early because the post-stimulus baseline returned to within +/- 1mV of the initial baseline, but this is the case only when the stimulus epoch was completed.
  2. Or they indicate that the pre-stimulus baseline check failed and the recording was stopped, which will result in the stimulus epoch containing NaN values and the stimulus epoch will not be completed.
    Either way we can drop the NaN values and determine if the sweep should pass or fail just based on the actual data points.

@ru57y34nn I think there is a 3rd option for NaN values; ramp stimuli typically have NaN for a brief period at the end of the ramp, before the post pulse baseline.

@ru57y34nn #508 (comment) Actually, it's not NaNs, It's zeros. [We had to cheat to get the ITC hardware to change the DA during a sweep.]

image

@timjarsky interesting to know about the ramps, has it always been that way? IPFX currently skips all post-stim checks for ramp stimuli, but could probably adapt to use that baseline interval.

@timjarsky another critical piece of info that will probably be needed after the first-pass fix goes in here is what is the minimum post-stim interval recorded by MIES (and pre-stim also if that has changed, although i think not).

t-b commented

Can someone upload a NWB file with NaNs in the sweep to the FTP? I'm curious to see why it has NaNs in the first place. For ITC hardware the acquired data is integer so we really can't have NaN in this step.

@t-b I just uploaded NWB file Vip-IRES-Cre;Ai14-572977.04.10.02.nwb. All of the X1PS_SubThresh, X3LP_Rheo, X4PS_SupraThresh, and X6SP_Rheo sweeps are showing up with NaN's at the end of the stimulus trace because the recording was stopped before the end of the full trace.

t-b commented

@ru57y34nn Thanks for the file. Of course there can be NaNs in the data, we are doing that since a long time.

I've taken the liberty to do some more digging and created the following graph for some QC related labnotebook entries for this file.

grafik

You see horizontally the sweep number and vertically Sweep and SetQC (0/1) results. The top vertical axis is the stimulus set acquisition cycle ID. Whenever that changes you have a new analysis function run. The dashed squares are there to guide the eye only.

Now you see that all analysis functions except Rheobase have a passing Set QC.

And one thing to notice is also that the first DAScale, sweep 5 to 12 also have a passing Set QC. But if you look at the data you see that it always ended early. That is actually by design as it stops immediately if the postpulse baseline is passing. See https://alleninstitute.github.io/MIES/file/_m_i_e_s___analysis_functions___patch_seq_8ipf.html?highlight=psq_dascale#_CPPv411PSQ_DAScale6stringP19AnalysisFunction_V3 and click on the flowchart (and there, left bottom).

The early stop when baselineQC is passing is done for various analysis functions. I found the following introduction dates:

IIRC this was done to speedup acquisition.

My conclusion:
Having NaN in the acquired sweep does not imply any QC results.

@timjarsky interesting to know about the ramps, has it always been that way? IPFX currently skips all post-stim checks for ramp stimuli, but could probably adapt to use that baseline interval.

@tmchartrand it's been that way since analysis functions were implemented for the ramp stimulus. Prior to that, rig operators were manually terminating the sweep.

I'm don't know if this gets at your question, but the duration of the period with zero values (at the end of the ramp, before the baseline) is variable (hardware/OS controlled).

@timjarsky another critical piece of info that will probably be needed after the first-pass fix goes in here is what is the minimum post-stim interval recorded by MIES (and pre-stim also if that has changed, although i think not).

@tmchartrand the min post-stimulus baseline should be 1 second. The first 500 ms post stim chunk is ignored, the second 500 ms chunk is evaluated for return to pre-stimulus baseline V (this is actually the target V for autobias). If the second chunk passes, the sweep is terminated.

*there is a small amount of time after the chunk passes before the sweep terminates. In this time the voltage may vary from baseline V and MIES and IPFX may disagree!

@wbwakeman I have opened PR # 511 which addresses this issue and I was hoping we could get someone to take a look at it because this is a bug that is continuing to cause most of our data to fail cell level qc because virtually every sweep is being tagged with either or both tags, "Recording stopped before completing experiment epoch" and "Stimulus contains NaN values". Both of these tags are being incorrectly applied to most of the data because of a bug in how the ending index of the stimulus epoch was being found without first dropping the np.nan values from the end of the stimulus array. The fix I have offered in PR #511 should suffice as a patch for now but we really should be just dropping all np.nan points from the stimulus arrays upon loading sweeps into IPFX (see above comment by @t-b as well for further clarification on the presence of np.nan values in stimulus arrays). Let me know if you have any questions or concerns about my fix in PR #511, thanks!

@ru57y34nn nice job chasing the issue. Your fix should work, but we must move towards getting the epoch information from the nwb2 file, rather than inferring them from traces. Do you know if all nwb2 files produced by pipeline save epoch info into the TimeIntervals table?

@sgratiy I'm not sure about the TimeIntervals table but I have found the epoch information for each sweep using the get_value method of the notebook in the protected _data attribute of the EphysDataSet object as such:

dataset._data.notebook.get_value("Epochs", sweep_num, None)

@sgratiy @ru57y34nn MIES NWBs do not have epoch info in the TimeIntervals table BUT it is the top issue on @t-b U01 project. So, I expect the pipeline will be generating NWB with epoch information in the TimeIntervals table in about a month.

We (Thomas and Technology (Wayne)) have in the past regenerated NWB files from old experiments so it is possible to add the epoch information for old experiments. I'm not suggesting this is the best path forward but it is an option and if we did go that way we'd need to ensure it was done correctly in cases where the NWB no longer matches the PXP (e.g., #510)

@sgratiy there's a lot of important discussion on this issue that I'd like to avoid getting hidden by moving on to the question of saved vs inferred epoch info. Perhaps you could start a new issue for any discussion of adding support for saved epoch info, and rename this issue something like "correctly infer recording/experiment epochs in current NWB2 files"?

Regardless of any future support for saved epoch info, we may need to continue supporting inferred epochs for e.g. files converted from abf/dat etc. I think we should resolve this issue by more clearly defining that approach to inferred epochs as I outlined above, beyond Rusty's initial fix PR. I can perhaps submit a draft PR in that direction, but I'd appreciate input regarding the two policies I mentioned, especially with an eye towards compatibility with the saved epoch info in the future

  1. Sweeps contain only actually recorded data, with missing data indicated by nan (stim or response) or zero (response) truncated when loading the sweep. This is currently done for zeros but not nan and could be easily extended:
    recording_end_idx = nonzero[-1] + 1

(Note that one complication here is the midsweep missing data for ramps, encoded as zeros as mentioned by Tim above, which currently is not dealt with properly and would be tricky to take this approach with)
or:

  1. Sweeps contain all data in the file, with the actual recording identified by the epoch, and analysis ignoring missing data where needed. In this case I'd suggest we reverse the current behavior for zeros for consistency (introduced in #384), and perhaps standardize the representation of missing data on load (to all NaN probably, across both stim and response).

@tmchartrand as you suggested I have renamed this issue and created a new one for reading saved epochs.

Since IPFX skips post stim check on ramps, zeros after ramp should not be a problem. Moreover, if we just truncate the trailing zeros and Nans, then you still will have zeros in mid sweep. Only the trailing zeros would be removed. I prefer policy 1 because we would deal with shorter traces reducing memory usage. Memory usage was a issue earlier prompting @kasbaker using LRU cache.
If you could submit a PR this would be very helpful. We could probably eliminate "sweep" epoch, but keep "recording" epoch to mean now all actually recorded data and set it as a default epoch.

@sgratiy yes, the zeros are not a problem for the analysis as currently run, but it would still be nice to avoid them if possible for consistency. Either removing them (and dealing with the resulting variable dt) or replacing with NaN would seem preferable to me, but would take a bit of extra work and are probably not worth the effort. I may at least file a separate issue as a placeholder for the future.
Also, to be clear, I really don't think memory usage is a factor between these two policies. The need for the cache was from a load being called repeatedly, affecting processing time not memory. It's possible that for some recent files with many terminated sweeps there would be a performance difference from truncating early, but in most cases I imagine the difference should be minor.

@tmchartrand thanks for clarifying about the memory consideration.
Regarding dropping zeros in the middle of the sweep: I am afraid that dropping sample points from the middle of the sweep may break lots of functionality. Many functions rely on the assumption of constant dt.
Yes, could you please file a separate issue and point out to your preferred solution, because now I am not sure which of your two policies you prefer.

Limit the scope of the issue to resolving the issue with the "Recording stopped before completing the experiment epoch" tag
The other issue with the auto_passed sweeps should be addressed as part of #516

  • Update PR #511 per review comments

Validation:

  • For the experiment in question plot all sweeps tagged "Recording stopped before completing the experiment epoch" and to confirm that these sweeps were legitimately tagged.

These two PR options I submitted should be ready to go and can probably take the place of @ru57y34nn 's current PR (maybe test this out and see if you agree if you're able, Rusty).

I tend to lean towards using the recording epoch (option 2 from those I proposed above), even though it's slightly more complex. Most of the functionality was already there, just not being used properly, and I think this version is more likely to align well with future changes to use epoch info directly from the NWB itself, rather than inferring epochs.

One bit that we may want to add here also is an integration test using one of the new-MIES-version NWB files.

@tmchartrand I finally got a chance to test out your two pull requests (519 and 520) and I think that either option will work for us as they both produce the correct results, but I agree with you and am also leaning towards option 2 as the better option going forward.

@ru57y34nn @tmchartrand I favor option 2 because it doesn't alter the data.

@sgratiy sounds like we have some consensus on this, could you take a look at #520? And oerhaps @gouwens should take a look also.

I have attached an excel doc with 565 cells that have the issue of incorrectly getting the stimulus/experiment epochs from the traces, which have resulted many sweeps being incorrectly tagged with "Recording stopped before completing the experiment epoch", and "Stimulus contains NaN values". I have also included extra columns with additional information, such as the sweep numbers that have been tagged (in columns with headers named after the tags) and the pass/fail status for QC and FX for reference and comparison after we re-run these.
This data set should constitute a significant portion of the cells with this issue, although there are likely more that are currently held up by #509 (we should fix that issue before testing those for this issue).

We should also be able to use this data set to test #516 as many of these cells, although they have a 'passed' status for FX, failed FX for at least one of longsquares, shortsquares, or ramps FX, and the fail messages for these are also included in the excel doc. Most of these FX fail message indicate issues caused either by the incorrect epochs or by failed sweeps getting through that should have been dropped from FX. Hopefully fixing this issue and issue 516 will fix most of these issues. Please let me know anyone has any questions about this dataset.
Issue508data.xlsx

@wbwakeman this should be fully addressed by the merged PR and follow-up fixes, I think we can close.

Resolved by #520