neurostuff/NiMARE

Implement caching for nimare.io.convert_neurosynth_to_dataset

JohannesWiesner opened this issue · 2 comments

Summary

It seems that nimare.io.convert_neurosynth_to_dataset can take a while to finish. This makes writing an analysis script a little tedious, because the user always has to wait for this function to finish before other analysis steps can be tried out. Maybe offering caching here (e.g., nimare.io.convert_neurosynth_to_dataset(cache_dir='../path/to/cache/dir')) would help?

Right now, I simply wrap the whole function inside a cached wrapper function like so, which of course also works fine:

from joblib import Memory

import nimare

# Assuming joblib here, since @memory.cache is joblib's decorator;
# the cache directory is just an example.
memory = Memory('../path/to/cache/dir', verbose=0)

@memory.cache
def get_nimare_dataset(databases):
    ds = nimare.io.convert_neurosynth_to_dataset(
        coordinates_file=databases['coordinates'],
        metadata_file=databases['metadata'],
        annotations_files=databases['features'],
    )
    return ds
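
With that in place, the first call performs the full conversion, and any later call with the same arguments loads the cached result from disk.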

Thought it would be a nice add-on if this were available out-of-the-box :)

Thanks! That is a good suggestion.

I think we currently have a workaround for this. Most of NiMARE's classes support save/load methods, which save the object as a pickle file and load a pickle file back as a NiMARE object.

Generally, the Neurosynth dataset is downloaded and converted into a NiMARE Dataset object only once. The dataset can be saved with dset.save(dset_fn), kept somewhere accessible, and loaded with dset = nimare.dataset.Dataset.load(dset_fn) whenever it needs to be reused.

This is not as elegant as caching the function, but at least it reduces the overhead.
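
For example, here is a minimal sketch of that workaround, reusing the databases dict from the snippet above (the file name is arbitrary):

import os

import nimare

dset_fn = 'neurosynth_dataset.pkl.gz'

if os.path.isfile(dset_fn):
    # Reuse the previously converted dataset.
    dset = nimare.dataset.Dataset.load(dset_fn)
else:
    # First run: convert the raw Neurosynth files, then save to disk.
    dset = nimare.io.convert_neurosynth_to_dataset(
        coordinates_file=databases['coordinates'],
        metadata_file=databases['metadata'],
        annotations_files=databases['features'],
    )
    dset.save(dset_fn)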