AllenInstitute/ipfx

Implement graceful degradation for feature extractor


  • Determine which strategy to use based on issue #398 (whole bunch of decorators, or rework computation graph)
  • Update documentation

Acceptance criteria: able to run "failed" cells from the June release set through ipfx run_pipeline to generate NWB files with spikes, and to report how each cell failed

Hoping this will resolve #278, #324, #330, #394

Hey @gouwens and @tmchartrand,

We need to make a decision about the strategy to take in resolving these issues. The choice is between:

  1. injecting error handling throughout the existing feature extraction & plotting "glue" code
  2. registering the feature extraction & plotting code to a computation graph and handling errors/inapplicable features through that graph.

In both cases, we would:

  • avoid changing the actual plotting and feature extraction code. The changes would only be to the glue code that runs feature extraction and plotting at the sweep-set/dataset scale.
  • avoid removing any existing functionality, though we might deprecate some.

Overall, I think that option 2 is a cleaner and more extensible solution. On the other hand, option 1 does not involve substantial architectural change to the current system.

What do you guys think? Are there many users who have written extensions depending on the existing glue code?

For more details and some code samples, please see this writeup.
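For a rough sense of the shape of each option (schematic only; the names here are made up for illustration, not the writeup's actual code):

    import functools
    import logging

    # Option 1 (sketch): a decorator injected into the existing glue code.
    # On failure it logs the error and returns a default, so downstream
    # glue can skip the missing result instead of aborting the whole cell.
    def fallible(default=None):
        def decorate(fn):
            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                try:
                    return fn(*args, **kwargs)
                except Exception as err:
                    logging.warning("%s failed: %s", fn.__name__, err)
                    return default
            return wrapper
        return decorate

    @fallible(default={})
    def extract_long_square_features(sweep_set):
        ...  # existing extraction logic, unchanged

    # Option 2 (sketch): each computation is registered as a node in a
    # graph; the graph runner records failures and marks dependent nodes
    # as failed/inapplicable rather than raising.
    class SpikeFeaturesNode:
        depends_on = ["sweep_features"]

        def compute(self, sweep_features):
            ...  # calls the existing spike feature extraction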

Thanks for the write-up. I'm not opposed to option 2 right off the bat. But I am not 100% sure I'm seeing the benefits right away. As you say in the write-up, right now we are implementing that graph as coded in those various feature processing scripts. I'm not sure how adding another layer of abstraction (the Node/Graph classes) is going to help. I know it's a simple example, but as written, I might have no idea what the graph computation is doing because you have to go back to all the Node class definitions to see which functions are actually being called. You do name them similarly, but it kind of seems like extra code for no particular reason? Like, why not just call the functions and do the validation/error as needed interspersed with the calls?

Probably some of that could be fixed - like you could have a more general node class that you pass the function to as an argument (so it's clearer when you're putting the graph together what you're doing). But really all you're doing then is keeping track of success/failed status. Maybe this is the best way of tracking that status, though - I'm not sure.
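Something like this, say (purely schematic):

    class Node:
        # One general node class that takes the function as an argument,
        # so the graph definition reads as a list of named functions
        # rather than a pile of per-computation subclasses.
        def __init__(self, name, fn, depends_on=()):
            self.name = name
            self.fn = fn
            self.depends_on = tuple(depends_on)
            self.status = "pending"  # becomes "succeeded" or "failed"
            self.value = None
            self.error = None

        def run(self, *inputs):
            try:
                self.value = self.fn(*inputs)
                self.status = "succeeded"
            except Exception as err:
                self.error = err
                self.status = "failed"

    # Wiring would then name the real functions directly, e.g.:
    # nodes = [Node("sweep_features", extract_sweep_features),
    #          Node("spike_features", extract_spike_features,
    #               depends_on=["sweep_features"])]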

Another thing I worry about is handling more complex inputs and outputs. Some functions may return tuples of values, and later functions that depend on them may only use some of those values. I think this system could end up hiding some of what is actually being used if everything has to be passed around as a generic "value".

I wonder if the real issue is that the pipeline code just needs to be reorganized/refactored. I think the pipeline code was originally written to (1) calculate all the values we need from a cell to publish it and (2) slam those results into a database that expects everything to be there. That's why it's currently monolithic and succeeds or fails in an all-or-none way. Option 2 is one way of refactoring it that would probably force some clean-up of the logic (which is a benefit), but I'm worried about adding boilerplate/overhead that'll make it harder to maintain. And maybe you could have gotten most of those benefits by refactoring in another way?

Regarding code that "depends on the existing glue code," do you mind specifying which code you consider glue? Like, I don't think I really call extract_cell_features directly, but I very frequently use the objects like LongSquareAnalysis used within that function. Those objects are kind of "glue" code themselves, depending on how you think about it.

Thanks for starting on this @NileGraddis , and for reaching out. I mostly agree with what Nathan said above. I'd also lean towards the first option, away from extra abstraction. Definitely agree that some refactoring is probably in order either way, and could help the first option come out more cleanly.
Even though there certainly are a few levels in the existing computation graph, I don't see supporting that complexity as too important for future developments. Even capturing the full chain of failures in the existing nested computations doesn't seem that critical, although it would certainly be nice (I could imagine a higher-level feature with a failure tag/message that simply says "failed due to failure in nested function X" or "failed due to sweep-level failures").
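Schematically, something as simple as this (illustrative only; the feature and function names are made up):

    rheobase = {
        "value": None,
        "status": "failed",
        "reason": "failed due to failure in nested function extract_spikes",
    }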
I don't think I have or know of any existing code that would be too disrupted by changes in the "glue" either way, although I agree that that's certainly hard to cleanly distinguish in the existing code. I'll try to think in a bit more depth about these solutions and let you know if I have more thoughts, or would be glad to join a brief call at some point soon if that's helpful.

Yeah, I'd also be happy to talk about it on a call as per @tmchartrand's suggestion, if that'd be easier.

Alright, I'll go ahead and schedule that call (and thank you both for the very fair feedback).

Also, one thing I'd like to point out (doing so here because it requires a code sample) is that we can handle boilerplate removal and compound outputs without too much trouble in option 2. Like this, for instance:
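(Sketch only; the decorator and the feature names below are illustrative, not a settled API.)

    graph = {}

    def node(name, outputs=(), depends_on=()):
        # Registration decorator: the plain function stays as-is and remains
        # directly callable; registering it in the graph is one line. The
        # `outputs` tuple names the fields of a compound (tuple) return
        # value, so a downstream node can depend on a specific field like
        # "subthreshold.input_resistance" instead of an opaque "value".
        def register(fn):
            graph[name] = {"fn": fn, "outputs": tuple(outputs),
                           "depends_on": tuple(depends_on)}
            return fn
        return register

    @node("subthreshold", outputs=("sag", "input_resistance"))
    def subthreshold_features(sweep_set):
        ...  # existing analysis; returns (sag, input_resistance)

    @node("summary", depends_on=("subthreshold.input_resistance",))
    def summarize(input_resistance):
        ...  # receives only the field it actually uses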

Of course, this does not address the larger concerns you've raised about straightforwardness, readability, and the preservation of the existing "glue" interfaces.

Notes from the call:

  • Nathan and Tom mainly use the stimulus analysis classes, less so the data set feature extractor & similar "plot everything" code
  • need to check what components Luke C & Stephanie S are calling
  • it would be useful to be able to output more intermediate results, such as info about specialized sweeps
  • more specific error messages in feature extraction might help when writing specialized qc criteria for newer experiments
  • it would be helpful to have a full example of option 2, calculating real features on real data

Overall, it seems like there is an intermediate solution where we solve the problem at the stimulus type level, but don't dig into the stimulus analysis classes. We might apply either method to this intermediate solution.
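For instance, the intermediate version could just wrap each stimulus-type analysis and record per-type failures, without touching the analysis classes themselves (a sketch; the `analyze` call stands in for whatever each class actually exposes):

    def run_stimulus_analyses(analyses, sweep_sets):
        # Sketch of the intermediate option: run each stimulus-type analysis
        # independently, capturing failures per stimulus type instead of
        # letting one failure abort the whole cell.
        results, errors = {}, {}
        for stim_type, analysis in analyses.items():
            try:
                results[stim_type] = analysis.analyze(sweep_sets[stim_type])
            except Exception as err:
                errors[stim_type] = str(err)
        return results, errors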

This is a file that fails QC, but for which we want to generate an NWB2 file with spikes. The current system does not create an output.nwb file:

"/allen/programs/celltypes/production/mousecelltypes/prod968/Ephys_Roi_Result_777417625/nwb2_Vip-IRES-Cre;Ai14-424614.04.01.01.nwb"