Produce BIDS Derivatives-compatible outputs
tsalo opened this issue · 8 comments
We have a set of outputs of interest from AROMA, including:
- Denoised 4D image, in native and standard space (possibly)
- Metrics (feature scores)
- Component time series
- Component maps, in native and standard space (possibly)
- Classifications (included with feature scores in "classification_overview.txt")
- Figure
We should figure out how best to make those outputs BIDS Derivatives-compatible.
At minimum, I think we can use filenames that are basically BIDS-ish, like what I propose in ME-ICA/tedana#574. This means using entities and suffixes that match BIDS convention, minus the "source entities" from the original files (e.g., `sub`, `ses`, `run`).
I don't know if we want to output the classifications/metrics as a JSON (as in `tedana`) or a TSV. A TSV would be easier to read...
I'm currently leaning toward the following outputs:
- desc-ICA_decomposition.json
- desc-ICA_mixing.tsv
- desc-ICA_components.nii.gz
- desc-ICAGammaThresholded_components.nii.gz
- desc-smoothAROMAnonaggr_bold.nii.gz
- AROMAnoiseICs.csv (if we keep them)
- desc-AROMA_metrics.tsv (instead of the json)
- desc-AROMA_metrics.json (metadata about metrics)
- aroma.svg
What about the folder structure? My guess is that users will run it in their data folder, say `sub-001/ses-01/func/`. We should create a derivatives folder or something, right?
Since we're planning to use this as a node in a BIDS App (e.g., fMRIPrep), I think we should dump things into an arbitrarily-named output folder. We can always develop a BIDS App version of the CLI at some point, if there's a demand. It's in the BIDS App version where I think we'd want to try to extract subject and session information, and then develop a folder and filename structure around that.
> Since we're planning to use this as a node in a BIDS App (e.g., fMRIPrep), I think we should dump things into an arbitrarily-named output folder.
Yes, it is trivial then to have some manager code that transfers everything to the right position and applies bids-derivatives naming.
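To illustrate the kind of manager code described above, here is a minimal sketch that copies files from an arbitrarily named output folder into a derivatives tree and prepends the source entities. The function name `bidsify`, the `source_entities` mapping, and the `sub-*/ses-*/func` layout are assumptions made for illustration, not part of AROMA or fMRIPrep.

```python
import shutil
from pathlib import Path


def bidsify(work_dir, deriv_dir, source_entities):
    """Copy outputs into a derivatives tree, prepending source entities.

    source_entities is an ordered mapping such as
    {"sub": "001", "ses": "01", "task": "rest"} (hypothetical input).
    """
    # Build the entity prefix, e.g. "sub-001_ses-01_task-rest".
    prefix = "_".join(f"{key}-{value}" for key, value in source_entities.items())

    # Assumed layout: <deriv_dir>/sub-<label>[/ses-<label>]/func
    target = Path(deriv_dir) / f"sub-{source_entities['sub']}"
    if "ses" in source_entities:
        target = target / f"ses-{source_entities['ses']}"
    target = target / "func"
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for src in sorted(Path(work_dir).iterdir()):
        if not src.is_file():
            continue
        # "desc-ICA_mixing.tsv" becomes "<prefix>_desc-ICA_mixing.tsv".
        dest = target / f"{prefix}_{src.name}"
        shutil.copy2(src, dest)
        copied.append(dest)
    return copied
```

Files without a BIDS-style suffix, such as `aroma.svg`, would only receive the prefix under this sketch, so they may need their own naming rule.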
> I'm currently leaning toward the following outputs:
> - desc-ICA_decomposition.json
> - desc-ICA_mixing.tsv
> - desc-ICA_components.nii.gz
> - desc-ICAGammaThresholded_components.nii.gz
> - desc-smoothAROMAnonaggr_bold.nii.gz
> - AROMAnoiseICs.csv (if we keep them)
> - desc-AROMA_metrics.tsv (instead of the json)
> - desc-AROMA_metrics.json (metadata about metrics)
> - aroma.svg
Yes, don't make it overly complicated - this should do!
Here is what I'm thinking the contents of the files should look like (cross-posted and lightly adapted from ME-ICA/tedana#649):
desc-ICA_mixing.tsv (required)
| | ica_00 | ica_01 | ica_02 | ica_03 | ica_04 |
|---|---|---|---|---|---|
| 0 | 1.96128 | 0.715816 | 1.81138 | 2.75372 | 1.84941 |
| 1 | 0.15217 | -0.85096 | -0.101651 | 0.147915 | -0.363037 |
| 2 | 0.326455 | -0.392816 | 1.1161 | 0.450553 | 0.580335 |
| 3 | 0.0141179 | -0.871712 | 1.02556 | 0.678525 | -0.0628587 |
| 4 | -0.0459112 | -1.02132 | 1.10107 | 0.306312 | -0.400553 |
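As a rough sketch of how such a file could be written, the snippet below emits a tab-separated mixing matrix with zero-padded `ica_XX` column names using only the standard library. It assumes the leading row index in the table above is a display artifact and is not stored in the file; `component_names` and `write_mixing_tsv` are hypothetical helper names.

```python
import csv
import io


def component_names(n_components):
    """Return zero-padded identifiers such as ica_00, ica_01, ..."""
    width = max(2, len(str(n_components - 1)))
    return [f"ica_{i:0{width}d}" for i in range(n_components)]


def write_mixing_tsv(mixing, handle):
    """Write rows (time points) by columns (components) as TSV.

    mixing is a non-empty list of equal-length rows of floats.
    No index column is written (an assumption of this sketch).
    """
    names = component_names(len(mixing[0]))
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(names)
    writer.writerows(mixing)
    return names


# Usage: write the first two time points and components from the table above.
buffer = io.StringIO()
write_mixing_tsv([[1.96128, 0.715816], [0.15217, -0.85096]], buffer)
print(buffer.getvalue())
```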
desc-ICA_decomposition.json (required)
{
  "Method": "Independent components analysis with MELODIC ICA algorithm implemented by FSL. Components are sorted by variance explained in descending order.",
  "ica_00": {
    "Description": "ICA fit to dimensionally-reduced data.",
    "Method": "AROMA"
  },
  "ica_01": {
    "Description": "ICA fit to dimensionally-reduced data.",
    "Method": "AROMA"
  },
  "ica_02": {
    "Description": "ICA fit to dimensionally-reduced data.",
    "Method": "AROMA"
  },
  "ica_03": {
    "Description": "ICA fit to dimensionally-reduced data.",
    "Method": "AROMA"
  },
  "ica_04": {
    "Description": "ICA fit to dimensionally-reduced data.",
    "Method": "AROMA"
  }
}
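Because every component entry in this sidecar is identical, it could be generated from the list of component names. The following is a sketch only: `decomposition_sidecar` is a hypothetical helper, and the strings are copied from the example above.

```python
import json

# Text copied from the proposed sidecar above.
TOP_LEVEL_METHOD = (
    "Independent components analysis with MELODIC ICA algorithm implemented "
    "by FSL. Components are sorted by variance explained in descending order."
)


def decomposition_sidecar(names):
    """Build the decomposition metadata with one entry per component."""
    sidecar = {"Method": TOP_LEVEL_METHOD}
    for name in names:
        sidecar[name] = {
            "Description": "ICA fit to dimensionally-reduced data.",
            "Method": "AROMA",
        }
    return sidecar


# Usage: serialize the sidecar for two components.
print(json.dumps(decomposition_sidecar(["ica_00", "ica_01"]), indent=2))
```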
desc-AROMA_metrics.tsv
Component | max_RP_corr | HFC | edge_fract | csf_fract | classification | rationale |
---|---|---|---|---|---|---|
ica_00 | 0.2 | 0.8163 | 0.507241 | 0.00403574 | rejected | hfc |
ica_01 | 0.43 | 0.1357 | 0.569411 | 0.00297853 | accepted | n/a |
ica_02 | 0.7 | 0.4905 | 0.33585 | 0.00195625 | rejected | hyperplane |
ica_03 | 0.1 | 0.7926 | 0.517262 | 0.00306524 | accepted | n/a |
ica_04 | 0.25 | 0.2831 | 0.32673 | 0.00231219 | accepted | n/a |
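One argument for the TSV is that downstream code can recover the rejected components with very little parsing. A minimal sketch, assuming the file is tab-separated with the column names shown above (`rejected_components` is a hypothetical helper, and the sample rows are a subset of the table):

```python
import csv
import io

# Three rows from the table above, as they might appear on disk.
METRICS_TSV = (
    "Component\tmax_RP_corr\tHFC\tedge_fract\tcsf_fract\tclassification\trationale\n"
    "ica_00\t0.2\t0.8163\t0.507241\t0.00403574\trejected\thfc\n"
    "ica_01\t0.43\t0.1357\t0.569411\t0.00297853\taccepted\tn/a\n"
    "ica_02\t0.7\t0.4905\t0.33585\t0.00195625\trejected\thyperplane\n"
)


def rejected_components(handle):
    """Return the identifiers of components classified as rejected."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row["Component"] for row in reader if row["classification"] == "rejected"]


print(rejected_components(io.StringIO(METRICS_TSV)))
```

If the list of noise components can be derived this way, a separate `AROMAnoiseICs.csv` becomes redundant, which bears on the "if we keep them" question.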
desc-AROMA_metrics.json
{
  "Component": {
    "Description": "The unique identifier of each component. This identifier matches column names in the mixing matrix TSV file.",
    "LongName": "Component identifier"
  },
  "classification": {
    "Description": "Classification from the classification procedure.",
    "Levels": {
      "accepted": "A component determined to be unrelated to motion. Included in denoised data.",
      "ignored": "A low-variance component included in denoised data.",
      "rejected": "A motion-related component excluded from denoised data."
    },
    "LongName": "Component classification"
  },
  "max_RP_corr": {
    "Description": "The maximum robust correlation of each component time series with a model of 72 realignment parameters.",
    "LongName": "Maximum motion parameter correlation"
  },
  "HFC": {
    "Description": "The frequency, as fraction of the Nyquist frequency, at which the higher and lower frequencies explain half of the total power between 0.01Hz and Nyquist.",
    "LongName": "High-frequency content"
  }
}
@tsalo This makes sense to me. There's no clear connection between `metrics.tsv` and `mixing.tsv`, so if this is something you'll expect to make a standard derivative in the future, I could see that as being something people would ask about. But I'm not sure it's worth trying to predict how that conversation will go before it's happened.
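The one link the proposal does define is the `Component` column, which the metrics sidecar describes as matching the mixing matrix column names. A small sketch of a consistency check built on that convention, assuming both files are tab-separated and the mixing file has no index column (`check_components_match` is a hypothetical helper):

```python
import csv
import io


def check_components_match(mixing_handle, metrics_handle):
    """Compare mixing-matrix columns with metrics 'Component' identifiers.

    Returns two lists: columns missing from the metrics table, and
    identifiers missing from the mixing matrix header.
    """
    mixing_columns = next(csv.reader(mixing_handle, delimiter="\t"))
    metrics_ids = [
        row["Component"] for row in csv.DictReader(metrics_handle, delimiter="\t")
    ]
    missing_from_metrics = [c for c in mixing_columns if c not in metrics_ids]
    missing_from_mixing = [c for c in metrics_ids if c not in mixing_columns]
    return missing_from_metrics, missing_from_mixing


# Usage with small in-memory files.
mixing = io.StringIO("ica_00\tica_01\n1.96128\t0.715816\n")
metrics = io.StringIO("Component\tclassification\nica_00\trejected\nica_01\taccepted\n")
print(check_components_match(mixing, metrics))
```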