Change decomposition files from json back to tsv
Summary
We originally had our component tables stored in TSV format, but we switched to JSONs in nipreps/fmriprep#456 to better match the BIDS Derivatives conventions. However, after discussing the issue with some of the BIDS folks recently (see nipreps/fmripost-aroma#10), it sounds like a TSV is more flexible and still BIDS Derivatives-compatible. We can still place metadata about the components/decomposition in a JSON.
Additional Detail
Next Steps
- Review BEP012 to determine what should go in the JSON and what should go in the TSV.
- Restructure outputs so that the core component table is written to a TSV.
- Update documentation.
- Update visualization code.
- Update the component table ingestion step in the tedana workflow (see the sketch below).
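To make that last step concrete, here is a minimal sketch of what ingestion could look like once the component table is a TSV with a JSON sidecar. The filenames and the pandas/json usage are illustrative placeholders, not the final tedana API:

```python
import json

import pandas as pd

# Filenames are placeholders; concrete names are proposed later in this thread.
# index_col=0 assumes the mixing TSV keeps a row index, as in the example table.
mixing = pd.read_csv("desc-ICA_mixing.tsv", sep="\t", index_col=0)
metrics = pd.read_csv("desc-ICA_metrics.tsv", sep="\t")
with open("desc-ICA_metrics.json") as f:
    metrics_meta = json.load(f)  # per-column descriptions, units, and levels

# Component IDs in the metrics table are meant to match mixing-matrix columns.
components = metrics["Component"].tolist()
mixing = mixing[components]  # reorder/select mixing columns by component ID
```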
Somehow I forgot to mention updates from the fMRIPrep/BIDS conversation (see the summary at nipreps/fmripost-aroma#10), but basically the BIDS maintainers agreed to stick with JSON files for the immediate future. However, I'm still leaning toward using a TSV here and in the aroma repo, because TSVs are easier to read and because we can use a standard TSV/JSON combo to keep our metrics in the TSV and metadata about the metrics in the JSON. I want to make sure others agree before I implement this, though. What does everyone think?
Here's what I'm thinking:
desc-ICA_mixing.tsv (required)
|   | ica_00 | ica_01 | ica_02 | ica_03 | ica_04 |
|---|---|---|---|---|---|
| 0 | 1.96128 | 0.715816 | 1.81138 | 2.75372 | 1.84941 |
| 1 | 0.15217 | -0.85096 | -0.101651 | 0.147915 | -0.363037 |
| 2 | 0.326455 | -0.392816 | 1.1161 | 0.450553 | 0.580335 |
| 3 | 0.0141179 | -0.871712 | 1.02556 | 0.678525 | -0.0628587 |
| 4 | -0.0459112 | -1.02132 | 1.10107 | 0.306312 | -0.400553 |
desc-ICA_decomposition.json (required)
{
"Method": "Independent components analysis with FastICA algorithm implemented by sklearn. Components are sorted by Kappa in descending order. Component signs are flipped to best match the data.",
"ica_00": {
"Description": "ICA fit to dimensionally-reduced optimally combined data.",
"Method": "tedana"
},
"ica_01": {
"Description": "ICA fit to dimensionally-reduced optimally combined data.",
"Method": "tedana"
},
"ica_02": {
"Description": "ICA fit to dimensionally-reduced optimally combined data.",
"Method": "tedana"
},
"ica_03": {
"Description": "ICA fit to dimensionally-reduced optimally combined data.",
"Method": "tedana"
},
"ica_04": {
"Description": "ICA fit to dimensionally-reduced optimally combined data.",
"Method": "tedana"
}
}
desc-ICA_metrics.tsv (new)
Component | kappa | rho | variance explained | normalized variance explained | countsigFR2 | countsigFS0 | dice_FR2 | dice_FS0 | countnoise | signal-noise_t | signal-noise_p | d_table_score | kappa ratio | d_table_score_scrub | classification | rationale |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ica_00 | 106.314 | 17.8163 | 0.507241 | 0.00403574 | 3856 | 1826 | 0.514541 | 0.243511 | 674 | 14.2077 | 7.00595e-42 | 4.2 | 0.292705 | 2.4 | accepted | nan |
ica_01 | 91.4559 | 18.1357 | 0.569411 | 0.00297853 | 4478 | 2762 | 0.569481 | 0.301275 | 913 | 8.30499 | 4.8039e-16 | 6.4 | 0.381962 | 4.4 | accepted | nan |
ica_02 | 81.9524 | 19.4905 | 0.33585 | 0.00195625 | 3506 | 2334 | 0.520029 | 0.2966 | 809 | 12.2543 | 2.09485e-31 | 7.4 | 0.251414 | 5 | accepted | nan |
ica_03 | 74.4902 | 15.7926 | 0.517262 | 0.00306524 | 4043 | 2407 | 0.54551 | 0.189069 | 897 | 13.9354 | 2.5095e-40 | 5 | 0.426008 | 3 | accepted | nan |
ica_04 | 72.7947 | 17.2831 | 0.32673 | 0.00231219 | 3986 | 3015 | 0.52485 | 0.238084 | 832 | 13.7875 | 4.04151e-39 | 5.4 | 0.275356 | 3.6 | accepted | nan |
desc-ICA_metrics.json (new)
{
"Component": {
"Description": "The unique identifier of each component. This identifier matches column names in the mixing matrix TSV file.",
"LongName": "Component identifier"
},
"classification": {
"Description": "Classification from the classification procedure.",
"Levels": {
"accepted": "A BOLD-like component included in denoised and high-Kappa data.",
"ignored": "A low-variance component included in denoised, but excluded from high-Kappa data.",
"rejected": "A non-BOLD component excluded from denoised and high-Kappa data."
},
"LongName": "Component classification"
},
"countnoise": {
"Description": "Number of 'noise' voxels (voxels highly weighted for component, but not from clusters) from each component.",
"LongName": "Noise voxel count",
"Units": "voxel"
},
"countsigFR2": {
"Description": "Number of significant voxels from the cluster-extent thresholded R2 model F-statistic map for each component.",
"LongName": "R2 model F-statistic map significant voxel count",
"Units": "voxel"
}
}
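For completeness, here is a rough sketch of how a metrics TSV and its JSON sidecar could be written with pandas and the standard library. The column subset, missing-value convention, and filenames are assumptions for illustration rather than the actual tedana output code:

```python
import json

import pandas as pd

# Toy component table; the real output would carry the full set of metrics.
metrics = pd.DataFrame(
    {
        "Component": ["ica_00", "ica_01"],
        "kappa": [106.314, 91.4559],
        "rho": [17.8163, 18.1357],
        "classification": ["accepted", "accepted"],
        "rationale": [None, None],
    }
)
# BIDS TSVs conventionally use "n/a" for missing values.
metrics.to_csv("desc-ICA_metrics.tsv", sep="\t", index=False, na_rep="n/a")

# Sidecar JSON describing the columns (abbreviated to two entries here).
sidecar = {
    "Component": {
        "LongName": "Component identifier",
        "Description": "Unique identifier matching the mixing matrix columns.",
    },
    "classification": {
        "LongName": "Component classification",
        "Description": "Classification from the classification procedure.",
    },
}
with open("desc-ICA_metrics.json", "w") as f:
    json.dump(sidecar, f, indent=4, sort_keys=True)
```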
Sounds good to me!
They look beautiful! Per the linked fMRIPrep discussion, though, does that mean these files are not BIDS-compatible for the moment?
From what I recall, there is nothing in the derivatives spec prohibiting them, and I think that they are ultimately cleaner than a pure-JSON approach, but I can't guarantee that other tools (especially fMRIPrep) will want to adopt this approach. That's part of why I want input on ME-ICA/aroma#5 as well.
One other thing we could do is use desc-tedana instead of desc-ICA for the metric files (see ME-ICA/aroma#5), in order to support multiple algorithms. I don't know how to link the metric files to the decomposition files in that case, but it's probably something worth looking into.