ME-ICA/tedana

Nonlinear Curve fitting is slow, and other anatomical T2star fitting issues

dowdlelt opened this issue ยท 1 comments

Summary

t2smap is a nice tool for estimating T2* parameters, however, for anatomical data it falls short in several ways. Curve fit is slow, you have to split up the (usually) single 4D data file just to pass the data in and there is no feedback on how long it is expected to take.

Additional Detail

Its a minor use case, but its mine. For my data, it would take ~60 hours or more to do the curve fitting and I find that offensive. And I think it may be worth speeding up the curve fit in general. I had put in the progress bar on a branch a while back and never opened a pull request. Want to fix that.

Next Steps

I've made modifications to t2smap that:
use joblib Parallel and delayed to run curve fitting - progress bars cleanly show how long it will take, and testing confirms this is an embarrassingly parallel problem - 24 cores provides a ~20x speed up.
tqdm over the voxel loop makes it easy to see progress. Likely we should print what each internal loop is doing (looping over number of good echoes). For anatomical data with many echoes, you get a lot of progress bars back to back and that can be confusing
image
Probably want to silence covariance warnings ahead of time?
Modified the input parser such that, if a dataset is a 4D file, with timeseries length = n_echoes, then treat that like it was multiple seperate files on the command line.

So I plan to open a pull request for this - but I wanted to go ahead and spit this out there while I finished up editing the doc strings.

๐Ÿ‘ for using joblib. Faster is definitely better.

  • I vaguely recall there being an issue where we had to lock the number of jobs to 1 in tedana though. Does that sound familiar?

๐Ÿ‘ for using tqdm frequently.

  • I actually started using it for curve fit in #1044:
    for voxel in tqdm(voxel_idx, desc=f"{echo_num}-echo monoexponential"):

    Note that you can include progress bar names to keep track of what's being done

๐Ÿ‘Ž on supporting 4D files. I can't speak to other data standards, but BIDS organizes multi-echo anatomical data into echo-wise files. While users can set the EchoTime metadata field as an array, that is only supported in specific cases where echo times vary and volume timing matters (e.g., a variable echo time, rather than multi-echo, fMRI scan). That said, I don't feel super strongly about this. I just don't want to add flexibility to support things that aren't recommended by established data standard (esp. BIDS). That's part of why I wanted to drop z-concatenated data way back when.