Produce BIDS Derivatives-compatible outputs

Question

tsalo opened this issue 4 years ago · 8 comments

We have a set of outputs of interest from AROMA, including:

Denoised 4D image, in native and standard space (possibly)
Metrics (feature scores)
Component time series
Component maps, in native and standard space (possibly)
Classifications (included with feature scores in "classification_overview.txt")
Figure

We should figure out how best to make those outputs BIDS Derivatives-compatible.

At minimum, I think we can use filenames that are basically BIDS-ish, like what I propose in ME-ICA/tedana#574. This means using entities and suffixes that match BIDS convention, minus the "source entities" from the original files (e.g., sub, ses, run).

I don't know if we want to output the classifications/metrics as a json (as in tedana) or a tsv. A tsv would be easier to read...

Answer 1 · 2020-11-08T23:06:36.000Z

I'm currently leaning toward the following outputs:

desc-ICA_decomposition.json
desc-ICA_mixing.tsv
desc-ICA_components.nii.gz
desc-ICAGammaThresholded_components.nii.gz
desc-smoothAROMAnonaggr_bold.nii.gz
AROMAnoiseICs.csv (if we keep them)
desc-AROMA_metrics.tsv (instead of the json)
desc-AROMA_metrics.json (metadata about metrics)
aroma.svg
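
A minimal sketch of how these entity-free names could be assembled inside an arbitrary output folder. The helper name `build_output_paths` and the dictionary labels are hypothetical and not part of AROMA; only the filenames come from the list above.

```python
from pathlib import Path

# Filenames proposed in this comment; the dictionary keys are hypothetical labels.
OUTPUT_FILENAMES = {
    "decomposition": "desc-ICA_decomposition.json",
    "mixing": "desc-ICA_mixing.tsv",
    "components": "desc-ICA_components.nii.gz",
    "thresholded_components": "desc-ICAGammaThresholded_components.nii.gz",
    "denoised_nonaggr": "desc-smoothAROMAnonaggr_bold.nii.gz",
    "noise_ics": "AROMAnoiseICs.csv",
    "metrics": "desc-AROMA_metrics.tsv",
    "metrics_metadata": "desc-AROMA_metrics.json",
    "figure": "aroma.svg",
}


def build_output_paths(out_dir):
    """Map each output label to its path inside ``out_dir``."""
    out_dir = Path(out_dir)
    return {label: out_dir / name for label, name in OUTPUT_FILENAMES.items()}


paths = build_output_paths("aroma_out")
print(paths["mixing"])
```

Keeping the names in one table would make it easy to change a `desc` label later without touching the code that writes the files.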

Answer 2 · 2020-11-09T08:16:40.000Z

What about the folder structure? My guess is that users will run it in their data folder, say sub-001/ses-01/func/. We should create a derivatives folder or something, right?

Answer 3 · 2020-11-09T13:40:36.000Z

Since we're planning to use this as a node in a BIDS App (e.g., fMRIPrep), I think we should dump things into an arbitrarily-named output folder. We can always develop a BIDS App version of the CLI at some point, if there's a demand. It's in the BIDS App version where I think we'd want to try to extract subject and session information, and then develop a folder and filename structure around that.

Answer 4 · 2020-11-10T07:08:46.000Z

Since we're planning to use this as a node in a BIDS App (e.g., fMRIPrep), I think we should dump things into an arbitrarily-named output folder.

Yes, it is trivial then to have some manager code that transfers everything to the right position and applies bids-derivatives naming.
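
A rough sketch of what such manager code might look like, assuming the source entities are already known to the calling BIDS App. The function names, the `ENTITY_ORDER` subset, and the `sub-<label>/ses-<label>/func/` target layout are assumptions for illustration, not an agreed design.

```python
import shutil
from pathlib import Path

# Assumed subset of source entities, in BIDS filename order.
ENTITY_ORDER = ("sub", "ses", "task", "run")


def add_source_entities(filename, entities):
    """Prefix an entity-free filename with source entities such as sub and ses."""
    prefix = "_".join(
        f"{key}-{entities[key]}" for key in ENTITY_ORDER if key in entities
    )
    return f"{prefix}_{filename}" if prefix else filename


def transfer_outputs(src_dir, deriv_dir, entities):
    """Copy every file in src_dir into deriv_dir/sub-<label>[/ses-<label>]/func/."""
    target = Path(deriv_dir) / f"sub-{entities['sub']}"
    if "ses" in entities:
        target = target / f"ses-{entities['ses']}"
    target = target / "func"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(Path(src_dir).iterdir()):
        if src.is_file():
            dest = target / add_source_entities(src.name, entities)
            shutil.copy2(src, dest)
            copied.append(dest)
    return copied


print(add_source_entities("desc-ICA_mixing.tsv", {"sub": "001", "ses": "01", "task": "rest"}))
```

Files without a BIDS suffix (aroma.svg, AROMAnoiseICs.csv) would simply get the same prefix here; a real BIDS App would probably want to rename those as well.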

I'm currently leaning toward the following outputs:

desc-ICA_decomposition.json
desc-ICA_mixing.tsv
desc-ICA_components.nii.gz
desc-ICAGammaThresholded_components.nii.gz
desc-smoothAROMAnonaggr_bold.nii.gz
AROMAnoiseICs.csv (if we keep them)
desc-AROMA_metrics.tsv (instead of the json)
desc-AROMA_metrics.json (metadata about metrics)
aroma.svg

Yes, don't make it overly complicated - this should do!

Answer 5 · 2020-11-11T09:59:17.000Z

@tsalo @oesteban your comments sound good to me!

Answer 6 · 2021-02-07T15:30:36.000Z

Here is what I'm thinking the contents of the files should look like (cross-posted and lightly adapted from ME-ICA/tedana#649):

desc-ICA_mixing.tsv (required)

	ica_00	ica_01	ica_02	ica_03	ica_04
0	1.96128	0.715816	1.81138	2.75372	1.84941
1	0.15217	-0.85096	-0.101651	0.147915	-0.363037
2	0.326455	-0.392816	1.1161	0.450553	0.580335
3	0.0141179	-0.871712	1.02556	0.678525	-0.0628587
4	-0.0459112	-1.02132	1.10107	0.306312	-0.400553

desc-ICA_decomposition.json (required)

{
    "Method": "Independent components analysis with MELODIC ICA algorithm implemented by FSL. Components are sorted by variance explained in descending order.",
    "ica_00": {
        "Description": "ICA fit to dimensionally-reduced data.",
        "Method": "AROMA"
    },
    "ica_01": {
        "Description": "ICA fit to dimensionally-reduced data.",
        "Method": "AROMA"
    },
    "ica_02": {
        "Description": "ICA fit to dimensionally-reduced data.",
        "Method": "AROMA"
    },
    "ica_03": {
        "Description": "ICA fit to dimensionally-reduced data.",
        "Method": "AROMA"
    },
    "ica_04": {
        "Description": "ICA fit to dimensionally-reduced data.",
        "Method": "AROMA"
    }
}
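
Since every component entry is identical, the file above could be generated from the number of components alone. This is only a sketch: `make_decomposition_metadata` is a hypothetical helper, and the two-digit zero padding is assumed from the `ica_00` style names in the example.

```python
import json


def make_decomposition_metadata(n_components):
    """Build the sidecar dictionary for a mixing matrix with n_components columns."""
    metadata = {
        "Method": (
            "Independent components analysis with MELODIC ICA algorithm "
            "implemented by FSL. Components are sorted by variance explained "
            "in descending order."
        )
    }
    for i in range(n_components):
        # Two-digit padding is an assumption based on the example names.
        metadata[f"ica_{i:02d}"] = {
            "Description": "ICA fit to dimensionally-reduced data.",
            "Method": "AROMA",
        }
    return metadata


print(json.dumps(make_decomposition_metadata(5), indent=4))
```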

desc-AROMA_metrics.tsv

Component	max_RP_corr	HFC	edge_fract	csf_fract	classification	rationale
ica_00	0.2	0.8163	0.507241	0.00403574	rejected	hfc
ica_01	0.43	0.1357	0.569411	0.00297853	accepted	n/a
ica_02	0.7	0.4905	0.33585	0.00195625	rejected	hyperplane
ica_03	0.1	0.7926	0.517262	0.00306524	accepted	n/a
ica_04	0.25	0.2831	0.32673	0.00231219	accepted	n/a
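
As a quick illustration of why the TSV is convenient, here is a sketch that reads the example table with the standard library and lists the components with a given classification. The function name is hypothetical; the data are the rows shown above.

```python
import csv
import io

# The example table from this comment, tab-separated.
METRICS_TSV = (
    "Component\tmax_RP_corr\tHFC\tedge_fract\tcsf_fract\tclassification\trationale\n"
    "ica_00\t0.2\t0.8163\t0.507241\t0.00403574\trejected\thfc\n"
    "ica_01\t0.43\t0.1357\t0.569411\t0.00297853\taccepted\tn/a\n"
    "ica_02\t0.7\t0.4905\t0.33585\t0.00195625\trejected\thyperplane\n"
    "ica_03\t0.1\t0.7926\t0.517262\t0.00306524\taccepted\tn/a\n"
    "ica_04\t0.25\t0.2831\t0.32673\t0.00231219\taccepted\tn/a\n"
)


def components_with_classification(tsv_text, classification):
    """Return the identifiers of all components with the given classification."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row["Component"] for row in reader if row["classification"] == classification]


rejected = components_with_classification(METRICS_TSV, "rejected")
print(rejected)  # ['ica_00', 'ica_02']
```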

desc-AROMA_metrics.json

{
    "Component": {
        "Description": "The unique identifier of each component. This identifier matches column names in the mixing matrix TSV file.",
        "LongName": "Component identifier"
    },
    "classification": {
        "Description": "Classification from the classification procedure.",
        "Levels": {
            "accepted": "A component determined to be unrelated to motion. Included in denoised data.",
            "ignored": "A low-variance component included in denoised data.",
            "rejected": "A motion-related component excluded from denoised data."
        },
        "LongName": "Component classification"
    },
    "max_RP_corr": {
        "Description": "The maximum robust correlation of each component time series with a model of 72 realignment parameters.",
        "LongName": "Maximum motion parameter correlation"
    },
    "HFC": {
        "Description": "The frequency, as a fraction of the Nyquist frequency, at which the higher and lower frequencies explain half of the total power between 0.01 Hz and Nyquist.",
        "LongName": "High-frequency content"
    }
}
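
Two consistency checks follow naturally from these files: every column in the metrics TSV should be described in the sidecar, and every Component value should match a column name in the mixing TSV. A minimal sketch, with hypothetical function names and the column names copied from the examples in this comment:

```python
def undocumented_columns(tsv_columns, sidecar):
    """Return TSV columns that have no entry in the JSON sidecar."""
    return [col for col in tsv_columns if col not in sidecar]


def unmatched_components(metric_components, mixing_columns):
    """Return component identifiers that are not columns of the mixing matrix."""
    mixing = set(mixing_columns)
    return [comp for comp in metric_components if comp not in mixing]


# Column names and sidecar keys copied from the examples in this comment.
metrics_columns = [
    "Component", "max_RP_corr", "HFC", "edge_fract",
    "csf_fract", "classification", "rationale",
]
sidecar = {"Component": {}, "classification": {}, "max_RP_corr": {}, "HFC": {}}
mixing_columns = ["ica_00", "ica_01", "ica_02", "ica_03", "ica_04"]
components = ["ica_00", "ica_01", "ica_02", "ica_03", "ica_04"]

print(undocumented_columns(metrics_columns, sidecar))
# ['edge_fract', 'csf_fract', 'rationale']
print(unmatched_components(components, mixing_columns))
# []
```

As posted, the example sidecar does not yet describe edge_fract, csf_fract, or rationale, which is what the first check reports.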

Answer 7 · 2021-02-12T20:30:56.000Z

@tsalo This makes sense to me. There's no clear connection between metrics.tsv and mixing.tsv, so if this is something you'll expect to make a standard derivative in the future, I could see that as being something people would ask about. But I'm not sure it's worth trying to predict how that conversation will go before it's happened.

Answer 8 · 2021-02-12T20:38:56.000Z

@effigies Thanks! You're right about the lack of connection. I'm hoping that we can come up with a way to fix that in the long term, but I'm glad this is good for the meantime.