Change decomposition files from json back to tsv
Summary
We originally had our component tables stored in TSV format, but we switched to JSONs in nipreps/fmriprep#456 to better match the BIDS Derivatives conventions. However, after discussing the issue with some of the BIDS folks recently (see nipreps/fmripost-aroma#10), it sounds like a TSV is more flexible and still BIDS Derivatives-compatible. We can still place metadata about the components/decomposition in a JSON.
Additional Detail
Next Steps
- Review BEP012 to determine what should go in the JSON and what should go in the TSV.
- Restructure outputs so that the core component table is written to a TSV.
- Update documentation.
- Update visualization code.
- Update the component table ingestion step in the tedana workflow (see the sketch below).
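To make that last step concrete, here is a minimal sketch of what ingestion could look like once the component table is a TSV with a JSON sidecar. The filenames and the pandas/json usage are illustrative placeholders, not the final tedana API:

```python
import json

import pandas as pd

# Filenames are placeholders; concrete names are proposed later in this thread.
# index_col=0 assumes the mixing TSV keeps a row index, as in the example table.
mixing = pd.read_csv("desc-ICA_mixing.tsv", sep="\t", index_col=0)
metrics = pd.read_csv("desc-ICA_metrics.tsv", sep="\t")
with open("desc-ICA_metrics.json") as f:
    metrics_meta = json.load(f)  # per-column descriptions, units, and levels

# Component IDs in the metrics table are meant to match mixing-matrix columns.
components = metrics["Component"].tolist()
mixing = mixing[components]  # reorder/select mixing columns by component ID
```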
Somehow I forgot to mention updates from the fMRIPrep/BIDS conversation (see the summary at nipreps/fmripost-aroma#10), but basically the BIDS maintainers agreed to stick with JSON files for the immediate future. However, I'm still leaning toward using a TSV here and in the aroma repo, because TSVs are easier to read and because we can use a standard TSV/JSON combo to keep our metrics in the TSV and metadata about the metrics in the JSON. I want to make sure others agree before I implement this, though. What does everyone think?
Here's what I'm thinking:
desc-ICA_mixing.tsv (required)
|   | ica_00 | ica_01 | ica_02 | ica_03 | ica_04 |
|---|---|---|---|---|---|
| 0 | 1.96128 | 0.715816 | 1.81138 | 2.75372 | 1.84941 |
| 1 | 0.15217 | -0.85096 | -0.101651 | 0.147915 | -0.363037 |
| 2 | 0.326455 | -0.392816 | 1.1161 | 0.450553 | 0.580335 |
| 3 | 0.0141179 | -0.871712 | 1.02556 | 0.678525 | -0.0628587 |
| 4 | -0.0459112 | -1.02132 | 1.10107 | 0.306312 | -0.400553 |
desc-ICA_decomposition.json (required)
{
"Method": "Independent components analysis with FastICA algorithm implemented by sklearn. Components are sorted by Kappa in descending order. Component signs are flipped to best match the data.",
"ica_00": {
"Description": "ICA fit to dimensionally-reduced optimally combined data.",
"Method": "tedana"
},
"ica_01": {
"Description": "ICA fit to dimensionally-reduced optimally combined data.",
"Method": "tedana"
},
"ica_02": {
"Description": "ICA fit to dimensionally-reduced optimally combined data.",
"Method": "tedana"
},
"ica_03": {
"Description": "ICA fit to dimensionally-reduced optimally combined data.",
"Method": "tedana"
},
"ica_04": {
"Description": "ICA fit to dimensionally-reduced optimally combined data.",
"Method": "tedana"
}
}
desc-ICA_metrics.tsv (new)
Component | kappa | rho | variance explained | normalized variance explained | countsigFR2 | countsigFS0 | dice_FR2 | dice_FS0 | countnoise | signal-noise_t | signal-noise_p | d_table_score | kappa ratio | d_table_score_scrub | classification | rationale |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ica_00 | 106.314 | 17.8163 | 0.507241 | 0.00403574 | 3856 | 1826 | 0.514541 | 0.243511 | 674 | 14.2077 | 7.00595e-42 | 4.2 | 0.292705 | 2.4 | accepted | nan |
ica_01 | 91.4559 | 18.1357 | 0.569411 | 0.00297853 | 4478 | 2762 | 0.569481 | 0.301275 | 913 | 8.30499 | 4.8039e-16 | 6.4 | 0.381962 | 4.4 | accepted | nan |
ica_02 | 81.9524 | 19.4905 | 0.33585 | 0.00195625 | 3506 | 2334 | 0.520029 | 0.2966 | 809 | 12.2543 | 2.09485e-31 | 7.4 | 0.251414 | 5 | accepted | nan |
ica_03 | 74.4902 | 15.7926 | 0.517262 | 0.00306524 | 4043 | 2407 | 0.54551 | 0.189069 | 897 | 13.9354 | 2.5095e-40 | 5 | 0.426008 | 3 | accepted | nan |
ica_04 | 72.7947 | 17.2831 | 0.32673 | 0.00231219 | 3986 | 3015 | 0.52485 | 0.238084 | 832 | 13.7875 | 4.04151e-39 | 5.4 | 0.275356 | 3.6 | accepted | nan |
desc-ICA_metrics.json (new)
{
"Component": {
"Description": "The unique identifier of each component. This identifier matches column names in the mixing matrix TSV file.",
"LongName": "Component identifier"
},
"classification": {
"Description": "Classification from the classification procedure.",
"Levels": {
"accepted": "A BOLD-like component included in denoised and high-Kappa data.",
"ignored": "A low-variance component included in denoised, but excluded from high-Kappa data.",
"rejected": "A non-BOLD component excluded from denoised and high-Kappa data."
},
"LongName": "Component classification"
},
"countnoise": {
"Description": "Number of 'noise' voxels (voxels highly weighted for component, but not from clusters) from each component.",
"LongName": "Noise voxel count",
"Units": "voxel"
},
"countsigFR2": {
"Description": "Number of significant voxels from the cluster-extent thresholded R2 model F-statistic map for each component.",
"LongName": "R2 model F-statistic map significant voxel count",
"Units": "voxel"
}
}
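For completeness, here is a rough sketch of how a metrics TSV and its JSON sidecar could be written with pandas and the standard library. The column subset, missing-value convention, and filenames are assumptions for illustration rather than the actual tedana output code:

```python
import json

import pandas as pd

# Toy component table; the real output would carry the full set of metrics.
metrics = pd.DataFrame(
    {
        "Component": ["ica_00", "ica_01"],
        "kappa": [106.314, 91.4559],
        "rho": [17.8163, 18.1357],
        "classification": ["accepted", "accepted"],
        "rationale": [None, None],
    }
)
# BIDS TSVs conventionally use "n/a" for missing values.
metrics.to_csv("desc-ICA_metrics.tsv", sep="\t", index=False, na_rep="n/a")

# Sidecar JSON describing the columns (abbreviated to two entries here).
sidecar = {
    "Component": {
        "LongName": "Component identifier",
        "Description": "Unique identifier matching the mixing matrix columns.",
    },
    "classification": {
        "LongName": "Component classification",
        "Description": "Classification from the classification procedure.",
    },
}
with open("desc-ICA_metrics.json", "w") as f:
    json.dump(sidecar, f, indent=4, sort_keys=True)
```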
Sounds good to me!
They look beautiful! Per the linked fMRIPrep discussion, though, does that mean these files are not BIDS-compatible for the moment?
From what I recall, there is nothing in the derivatives spec prohibiting them, and I think that they are ultimately cleaner than a pure-JSON approach, but I can't guarantee that other tools (especially fMRIPrep) will want to adopt this approach. That's part of why I want input on ME-ICA/aroma#5 as well.
One other thing we could do is use desc-tedana instead of desc-ICA for the metric files (see ME-ICA/aroma#5), in order to support multiple algorithms. I don't know how to link the metric files to the decomposition files in that case, but it's probably something worth looking into.