This is a repo merging NFacT (Shaun Warrington, Ellie Thompson, and Stamatios Sotiropoulos) and ptx_decomp (Saad Jbabdi and Rogier Mars).
NFACT (Non-negative matrix Factorisation of Tractography data) is a set of modules (as well as an end to end pipeline) that decomposes tractography data using NMF/ICA.
It consists of three "main" decomposition modules:
- nfact_pp (Pre-process data for decomposition)
- nfact_decomp (Decomposes a single or average group matrix using NMF or ICA)
- nfact_dr (Dual regression on group matrix)
as well as three auxiliary "modules":
- nfact_config (creates config files for the pipeline and changing any hyperparameters)
- nfact_Qc (Creates hitmaps to check for bias in decomposition)
- nfact_glm (To run hypothesis testing)
and a pipeline wrapper
- nfact (runs either all three main modules or just nfact_decomp and nfact_dr)
_ _ ______ ___ _____ _____
| \ | || ___| / \ / __ \|_ _|
| \| || |_ / /_\ \| / \/ | |
| || _| | _ || | | |
| |\ || | | | | || \__/\ | |
\_| \_/\_| \_| |_/ \____/ \_/
This pipeline runs nfact_pp, nfact_decomp and nfact_dr on tractography data that has been processed by bedpostx.
The pipeline first creates the omatrix2
usage: nfact [-h] [-l LIST_OF_SUBJECTS] [-s SEED [SEED ...]] [-c CONFIG] [-S] [-i REF] [-b BPX_PATH] [-w WARPS [WARPS ...]] [-r ROIS [ROIS ...]] [-t TARGET2] [-d DIM] [-o OUTDIR] [-a ALGO]
options:
-h, --help show this help message and exit
Inputs:
-l LIST_OF_SUBJECTS, --list_of_subjects LIST_OF_SUBJECTS
Filepath to a list of subjects.
-s SEED [SEED ...], --seed SEED [SEED ...]
A single or list of seeds
-c CONFIG, --config CONFIG
An nfact_config file. If this is provided no other arguments are needed.
-S, --skip Skips NFACT_PP. Pipeline still assumes that NFACT_PP has been run before.
PP:
-i REF, --image_standard_space REF
Standard space reference image
-b BPX_PATH, --bpx BPX_PATH
Path to Bedpostx folder inside a subjects directory.
-w WARPS [WARPS ...], --warps WARPS [WARPS ...]
Path to warps inside a subjects directory (can accept multiple arguments)
-r ROIS [ROIS ...], --rois ROIS [ROIS ...]
A single or list of ROIS
-t TARGET2, --target TARGET2
Path to target image. If not given will create a whole mask from reference image
decomp:
-d DIM, --dim DIM Number of dimensions/components
-o OUTDIR, --outdir OUTDIR
Path to where to create an output folder
-a ALGO, --algo ALGO What algorithm to run. Options are: ICA (default), or NMF.
example call:
nfact --list_of_subjects /absolute path/sub_list \
--seed thalamus.nii.gz \
--algo NMF \
--dim 100 \
--outdir /absolute path/save directory \
--warps standard2acpc_dc.nii.gz acpc_dc2standard.nii.gz \
--ref $FSLDIR/data/standard/MNI152_T1_2mm_brain.nii.gz \
--bpx Diffusion.bedpostX
With a config file:
nfact --config /absolute path/nfact_config.config
_ _ ______ ___ _____ _____ ______ ______
| \ | || ___| / \ / __ \|_ _| | ___ \| ___ \
| \| || |_ / /_\ \| / \/ | | | |_/ /| |_/ /
| || _| | _ || | | | | __/ | __/
| |\ || | | | | || \__/\ | | | | | |
\_| \_/\_| \_| |_/ \____/ \_/ \_| \_|
Pre-processing of tractography data for decomposition with NFacT (Non-negative matrix Factorisation of Tractography data).
Under the hood, NFACT PP uses the probtrackx2 omatrix2 option to get a seed-by-target connectivity matrix.
Required before running NFACT PP:
- Crossing-fibre diffusion modelled data (bedpostX)
- Seeds (either surfaces or volumes)
NFACT PP has three modes: surface seed, volume and filetree.
Required input:
- List of subjects
- Output directory
Input needed for filetree mode:
- .tree file (NFACT_PP comes with some defaults such as hcp)
Input needed for both surface and volume mode:
- Seeds path inside a subject's folder
- Warps path inside a subject's folder
- bedpostx folder path inside a subject's folder
Input for surface seed mode:
- Seeds as surfaces
- ROIs as surfaces (medial wall)
Input needed for volume mode:
- Seeds as volumes
NFACT PP can be used in a folder-agnostic way by providing the paths to seeds/bedpostX/target inside a subject folder (e.g. --seed seeds/amygdala.nii.gz).
The other way is to use the --file_tree option with the name of a file tree (see https://open.win.ox.ac.uk/pages/fsl/file-tree/index.html for further details on filetree). In this case seeds/rois/bedpostx do not need to be specified, as nfact_pp will try to find the appropriate files.
nfact_pp --file_tree hcp --list_of_subjects /home/study/list_of_subjects
Filetrees are saved in the filetrees folder in nfact, so custom filetrees can be put there and called in the same way as the command above. NFACT_PP currently has a built-in filetree for HCP (from qunex output) to perform full brain tractography.
Seed files are aliased as (seed), medial wall as (medial_wall), warps as (diff2std, std2diff) and bedpostX as (bedpostX). Two seeds are supported if the seeds are bilateral, indicated with {hemi}.seed, with the actual seed names being L.seed.nii.gz/R.seed.nii.gz. A single seed can be given as well.
usage: nfact_pp [-h] [-hh] [-O] [-l LIST_OF_SUBJECTS] [-o OUTDIR] [-f FILE_TREE] [-s SEED [SEED ...]] [-w WARPS [WARPS ...]] [-b BPX_PATH] [-m MEDIAL_WALL [MEDIAL_WALL ...]] [-i REF] [-t TARGET2] [-N NSAMPLES] [-mm MM_RES] [-p PTX_OPTIONS] [-e EXCLUSION]
[-S [STOP ...]] [-n N_CORES] [-C] [-cq CLUSTER_QUEUE] [-cr CLUSTER_RAM] [-ct CLUSTER_TIME] [-cqos CLUSTER_QOS]
options:
-h, --help show this help message and exit
-hh, --verbose_help Prints help message and example usages
-O, --overwrite Overwrite previous file structure
Compulsory Arguments:
-l LIST_OF_SUBJECTS, --list_of_subjects LIST_OF_SUBJECTS
A list of subjects in text form. If not provided NFACT PP will use all subjects in the study folder. All subjects need full file path to subjects directory
-o OUTDIR, --outdir OUTDIR
Directory to save results in
REQUIRED FOR FILETREE MODE: :
-f FILE_TREE, --file_tree FILE_TREE
Use this option to provide name of predefined file tree to perform whole brain tractography. NFACT_PP currently comes with HCP filetree. See documentation for further information.
Tractography options: :
-s SEED [SEED ...], --seed SEED [SEED ...]
A single or list of seeds
-w WARPS [WARPS ...], --warps WARPS [WARPS ...]
Path to warps inside a subjects directory (can accept multiple arguments)
-b BPX_PATH, --bpx BPX_PATH
Path to Bedpostx folder inside a subjects directory.
-m MEDIAL_WALL [MEDIAL_WALL ...], --medial_wall MEDIAL_WALL [MEDIAL_WALL ...]
REQUIRED FOR SURFACE MODE: Medial wall file. Use when doing whole brain surface tractography to provide medial wall.
-i REF, --ref REF Standard space reference image. Default is $FSLDIR/data/standard/MNI152_T1_2mm_brain.nii.gz
-t TARGET2, --target TARGET2
Name of target. If not given will create a whole mask from reference image
-N NSAMPLES, --nsamples NSAMPLES
Number of samples per seed used in tractography (default = 1000)
-mm MM_RES, --mm_res MM_RES
Resolution of target image (Default = 2 mm)
-p PTX_OPTIONS, --ptx_options PTX_OPTIONS
Path to ptx_options file for additional options
-e EXCLUSION, --exclusion EXCLUSION
Path to an exclusion mask. Will reject pathways passing through locations given by this mask
-S [STOP ...], --stop [STOP ...]
Use wtstop and stop in the tractography. Takes a file path to a json file containing stop and wtstop masks, JSON keys must be stopping_mask and wtstop_mask. Argument can be used with the --filetree, in that case no json file is needed.
Parallel Processing arguments:
-n N_CORES, --n_cores N_CORES
If should parallel process and with how many cores
Cluster Arguments:
-C, --cluster Use cluster environment
-cq CLUSTER_QUEUE, --queue CLUSTER_QUEUE
Cluster queue to submit to
-cr CLUSTER_RAM, --cluster_ram CLUSTER_RAM
Ram that job will take. Default is 60
-ct CLUSTER_TIME, --cluster_time CLUSTER_TIME
Time that job will take. nfact_pp will assign a time if none given
-cqos CLUSTER_QOS, --cluster_qos CLUSTER_QOS
Set the qos for the cluster
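The --stop option above expects a JSON file whose keys are stopping_mask and wtstop_mask. The following Python sketch shows one way such a file could be produced. The function name and the mask paths are hypothetical, and storing each value as a single path string is an assumption, since the help text does not state the value format.

```python
import json


def stop_masks_json(stopping_mask, wtstop_mask):
    """Return JSON text containing the two keys that --stop expects."""
    stop_masks = {
        "stopping_mask": stopping_mask,  # key name taken from the help text
        "wtstop_mask": wtstop_mask,      # key name taken from the help text
    }
    return json.dumps(stop_masks, indent=4)


# Hypothetical mask paths, for illustration only
text = stop_masks_json("masks/stop.nii.gz", "masks/wtstop.nii.gz")
print(text)
# To save it: open("stop_masks.json", "w").write(text)
```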
Example Usage:
Seed mode:
nfact_pp --list_of_subjects /home/study/sub_list \
--outdir /home/study \
--bpx /path_to/.bedpostX \
--seed /path_to/L.white.32k_fs_LR.surf.gii /path_to/R.white.32k_fs_LR.surf.gii \
--medial_wall /path_to/L.atlasroi.32k_fs_LR.shape.gii /path_to/R.atlasroi.32k_fs_LR.shape.gii \
--warps /path_to/stand2diff.nii.gz /path_to/diff2stand.nii.gz \
--n_cores 3
Volume mode:
nfact_pp --list_of_subjects /home/study/sub_list \
--outdir /home/study \
--bpx /path_to/.bedpostX \
--seed /path_to/L.white.nii.gz /path_to/R.white.nii.gz \
--warps /path_to/stand2diff.nii.gz /path_to/diff2stand.nii.gz \
--ref MNI152_T1_1mm_brain.nii.gz \
--target dlpfc.nii.gz
Filetree mode:
nfact_pp --file_tree hcp \
--list_of_subjects /home/study/sub_list \
--outdir /home/study \
--n_cores 3
------------------------------------------------------------------------------------------------------------------------------------------
## NFACT decomp
This is the main decomposition module of NFACT. It runs either ICA or NMF and saves the components. Components can also be normalised, and winner-takes-all maps can be created.
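To illustrate what a winner-takes-all map is, the NumPy sketch below labels each voxel with the component that is strongest there. It is a generic illustration, not NFACT's own implementation: the function name, the (components x voxels) array layout, and the way the threshold is applied (z-scoring each component across voxels and ignoring values at or below the threshold) are all assumptions.

```python
import numpy as np


def winner_takes_all(components, z_thr=0.0):
    """Label each voxel with the 1-based index of its strongest component.

    components: array of shape (n_components, n_voxels).
    Voxels where no component's z-score exceeds z_thr are labelled 0.
    """
    comps = np.asarray(components, dtype=float)
    # z-score each component across voxels so components are comparable
    mean = comps.mean(axis=1, keepdims=True)
    std = comps.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid dividing by zero for constant components
    z = (comps - mean) / std
    # Ignore values that do not exceed the threshold
    above = z > z_thr
    z = np.where(above, z, -np.inf)
    labels = np.argmax(z, axis=0) + 1
    labels[~above.any(axis=0)] = 0
    return labels


# Two toy components over four voxels
toy = [[5, 0, 0, 1],
       [0, 4, 0, 1]]
print(winner_takes_all(toy))
```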
### Usage
usage: nfact [-h] [-l LIST_OF_SUBJECTS] [-o OUTDIR] [-d DIM] [--seeds SEEDS] [-m MIGP] [-a ALGO] [-W] [-z WTA_ZTHR] [-N] [-S] [-O] [-c CONFIG]
options:
-h, --help show this help message and exit
-l LIST_OF_SUBJECTS, --list_of_subjects LIST_OF_SUBJECTS
REQUIRED: Filepath to a list of subjects. List can contain a single subject.
-o OUTDIR, --outdir OUTDIR
REQUIRED: Path to output folder
-d DIM, --dim DIM REQUIRED: Number of dimensions/components
--seeds SEEDS, -s SEEDS
REQUIRED: File of seeds used in NFACT_PP/probtrackx
-m MIGP, --migp MIGP MELODIC's Incremental Group-PCA dimensionality (default is 1000)
-a ALGO, --algo ALGO What algorithm to run. Options are: ICA (default), or NMF.
-W, --wta Save winner-takes-all maps
-z WTA_ZTHR, --wta_zthr WTA_ZTHR
Winner-takes-all threshold (default=0.)
-N, --normalise normalise components by scaling
-S, --sign_flip sign flip components
-O, --overwrite Overwrite previous file structure. Useful if wanting to perform multiple GLMs or ICA and NMF
-c CONFIG, --config CONFIG
Provide config file to change hyperparameters for ICA and NMF. Please see scikit-learn documentation for NMF and FastICA for further details
An example call:
nfact_decomp --list_of_subjects /absolute path/sub_list \
--seeds /absolute path/seeds.txt \
--outdir /absolute path/study_directory \
--algo ICA \
--migp 1000 \
--dim 100 --normalise --wta --sign_flip
------------------------------------------------------------------------------------------------------------------------------------------
## NFACT Dr
This is the dual regression module of NFACT. The dual regression technique depends on which decomposition method was used: if NMF was used, non-negative least squares regression is used; if ICA was used, standard regression is used.
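The difference between the two regression types can be sketched with NumPy and SciPy as below. This is a generic two-stage dual regression, not NFACT's actual code: the function names, the (seeds x targets) orientation of the subject matrix, and the choice of which group matrix is regressed first are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import nnls


def _solve(A, B, non_negative):
    """Solve A @ Z ~= B for Z in the least-squares sense."""
    if non_negative:
        # Non-negative least squares, one column of B at a time
        return np.column_stack([nnls(A, B[:, j])[0] for j in range(B.shape[1])])
    # Ordinary least squares for all columns at once
    return np.linalg.lstsq(A, B, rcond=None)[0]


def dual_regression(X, group_maps, algo="ICA"):
    """Two-stage regression of group components onto one subject.

    X: subject connectivity matrix, shape (n_seeds, n_targets).
    group_maps: group target-side components, shape (n_components, n_targets).
    """
    non_negative = algo.upper() == "NMF"
    # Stage 1: X.T ~= group_maps.T @ seed_side.T gives subject seed-side components
    seed_side = _solve(group_maps.T, X.T, non_negative).T
    # Stage 2: X ~= seed_side @ target_side gives subject target-side components
    target_side = _solve(seed_side, X, non_negative)
    return seed_side, target_side


# Toy data built from known non-negative factors
rng = np.random.default_rng(0)
W_true = rng.random((6, 2))
H_true = rng.random((2, 8))
X = W_true @ H_true
seed_side, target_side = dual_regression(X, H_true, algo="NMF")
print(seed_side.shape, target_side.shape)
```

With algo="NMF" every estimated value is constrained to be zero or positive, while algo="ICA" allows negative values.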
### Usage
usage: nfact_dr [-h] [-l LIST_OF_SUBJECTS] [-o OUTDIR] [-a ALGO] [--seeds SEEDS] [-n NFACT_DECOMP_DIR] [-d DECOMP_DIR] [-N]
options:
-h, --help show this help message and exit
-l LIST_OF_SUBJECTS, --list_of_subjects LIST_OF_SUBJECTS
REQUIRED: Filepath to a list of subjects
-o OUTDIR, --outdir OUTDIR
REQUIRED: Path to output directory
-a ALGO, --algo ALGO REQUIRED: Which NFACT algorithm to perform dual regression on
--seeds SEEDS, -s SEEDS
REQUIRED: File of seeds used in NFACT_PP/probtrackx
-n NFACT_DECOMP_DIR, --nfact_decomp_dir NFACT_DECOMP_DIR
REQUIRED IF NFACT_DECOMP: Filepath to the NFACT_decomp directory. Use this if you have run NFACT decomp
-d DECOMP_DIR, --decomp_dir DECOMP_DIR
REQUIRED IF NOT NFACT_DECOMP: Filepath to decomposition components. WARNING NFACT decomp expects components to be named in a set way. See documentation for further info.
-N, --normalise normalise components by scaling
nfact_dr is independent of nfact_decomp; however, it expects a strict file naming convention. If nfact_decomp has not been run, then the group average files and components must all be in the same folder. Components must be named W_dim* and G_dim*, with group average files named coords_for_fdt_matrix2 and lookup_tractspace_fdt_matrix2.nii.gz.
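Before pointing nfact_dr at a folder of externally produced components, it can help to check the folder against the names listed above. The sketch below does this with the Python standard library. The function name is hypothetical, and the check is only an illustration of the naming convention, not something nfact_dr itself provides.

```python
from pathlib import Path

# File names and patterns taken from the naming convention described above
REQUIRED_FILES = ["coords_for_fdt_matrix2", "lookup_tractspace_fdt_matrix2.nii.gz"]
REQUIRED_PATTERNS = ["W_dim*", "G_dim*"]


def missing_decomp_inputs(decomp_dir):
    """Return the required names/patterns that are not found in decomp_dir."""
    decomp_dir = Path(decomp_dir)
    missing = [name for name in REQUIRED_FILES if not (decomp_dir / name).exists()]
    missing += [pat for pat in REQUIRED_PATTERNS if not any(decomp_dir.glob(pat))]
    return missing


# An empty list means everything the convention asks for was found
print(missing_decomp_inputs("."))
```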
------------------------------------------------------------------------------------------------------------------------------------------
## NFACT Qc
This is a quality control module that creates a number of hitmaps that can be used to check for bias in decomposition.
Each map contains the number of times that voxel/vertex appears in the decomposition.
## Output:
Prefix:
- hitmap_*.nii.gz: Volume nii component. Components are thresholded by zscoring to remove noise
- hitmap_*_raw.nii.gz: Volume nii component. Components are not thresholded
- mask_*.nii.gz: Volume nii component. Binary mask of thresholded components
- mask_*_raw.nii.gz: Volume nii component. Binary mask of unthresholded components
- *.gii: Surface gii component. Components are thresholded by zscoring to remove noise
- *_raw.gii: Surface gii component. Components are not thresholded
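The idea behind the hitmaps and masks listed above can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the module's actual code: the function name, the (components x voxels) layout, and thresholding on the absolute z-score of each component are assumptions.

```python
import numpy as np


def hitmap(components, z_thr=None):
    """Count, per voxel, how many components are non-zero there.

    components: array of shape (n_components, n_voxels).
    z_thr: if given, each component is z-scored across voxels and values
           whose absolute z-score is below z_thr are set to zero first.
    Returns (counts, binary_mask).
    """
    comps = np.asarray(components, dtype=float)
    if z_thr is not None:
        std = comps.std(axis=1, keepdims=True)
        std[std == 0] = 1.0  # avoid dividing by zero for constant components
        z = (comps - comps.mean(axis=1, keepdims=True)) / std
        comps = np.where(np.abs(z) >= z_thr, comps, 0.0)
    counts = np.count_nonzero(comps, axis=0)
    mask = (counts > 0).astype(int)
    return counts, mask


# Raw (unthresholded) hitmap for two toy components over three voxels
counts, mask = hitmap([[1, 0, 2],
                       [0, 0, 3]])
print(counts, mask)
```

A voxel that is hit by many more components than its neighbours may point to a bias in the decomposition.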
## Usage:
usage: nfact [-h] [-n NFACT_FOLDER] [-d DIM] [-a ALGO] [-t THRESHOLD] [-O]
options:
-h, --help show this help message and exit
-n NFACT_FOLDER, --nfact_folder NFACT_FOLDER
REQUIRED: Path to nfact output folder
-d DIM, --dim DIM REQUIRED: Number of dimensions/components
-a ALGO, --algo ALGO REQUIRED: What algorithm to qc. Options are: NMF (default), or ICA.
-t THRESHOLD, --threshold THRESHOLD
Threshold value for z scoring the normalised image
-O, --overwrite Overwrite previous QC
------------------------------------------------------------------------------------------------------------------------------------------
## NFACT config
NFACT config is a utility tool for nfact that creates a variety of config files to be used in nfact.
NFACT config can create:
1) nfact_config_pipeline.config. This JSON config file is used in the nfact pipeline to give greater control over parameters.
2) nfact_config_decomp.config. A config file to control the hyperparameters of the ICA and NMF functions.
3) nfact_config_sublist. A list of subjects in a folder.
## Usage:
usage: nfact_config [-h] [-C] [-D] [-s SUBJECT_LIST] [-o OUTPUT_DIR]
options:
-h, --help show this help message and exit
-C, --config Creates a config file for NFACT pipeline
-D, --decomp_only Creates a config file for scikit-learn function hyperparameters
-s SUBJECT_LIST, --subject_list SUBJECT_LIST
Creates a subject list from a given directory
-o OUTPUT_DIR, --output_dir OUTPUT_DIR
Where to save config file
Boolean values in a JSON file must be lower case, i.e. true, false. Unless you are familiar with JSON files, it is advised to use a JSON linter to check that they are valid.
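If no JSON linter is at hand, Python's built-in json module can serve as a quick validity check. The sketch below is a minimal example; the function name is hypothetical and the "overwrite" key is used only for illustration.

```python
import json


def check_json(text):
    """Return (True, parsed) if text is valid JSON, else (False, error message)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"


print(check_json('{"overwrite": false}'))  # valid: lower-case boolean
print(check_json('{"overwrite": False}'))  # invalid: Python-style boolean
```

To check a file rather than a string, pass the file's contents, for example check_json(open("nfact_config.config").read()).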
### nfact_config_pipeline.config overview
This is the config file for the nfact pipeline. Please check the individual modules for further details on arguments.
{
    "global_input": {
        "list_of_subjects": "Required",
        "outdir": "Required",
        "seed": [
            "Required unless file_tree specified"
        ],
        "overwrite": false,
        "skip": false
    },
    "nfact_pp": {
        "warps": [],
        "bpx_path": false,
        "rois": [],
        "file_tree": false,
        "ref": false,
        "target2": false,
        "nsamples": "1000",
        "mm_res": "2",
        "ptx_options": false,
        "n_cores": false,
        "cluster": false
    },
    "nfact_decomp": {
        "dim": "Required",
        "migp": "1000",
        "algo": "ICA",
        "wta": false,
        "wta_zthr": "0.0",
        "normalise": false,
        "sign_flip": false,
        "config": false
    },
    "nfact_dr": {
        "normalise": false
    }
}
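Entries in the pipeline config that still hold a "Required" placeholder can be found with a short script before running the pipeline. The sketch below is one possible check, not part of nfact; the function name and the example values are hypothetical.

```python
import json


def unfilled_required(config, prefix=""):
    """Return dotted key paths whose value still starts with 'Required'."""
    unfilled = []
    for key, value in config.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested sections such as "global_input"
            unfilled += unfilled_required(value, prefix=f"{path}.")
        else:
            values = value if isinstance(value, list) else [value]
            if any(isinstance(v, str) and v.startswith("Required") for v in values):
                unfilled.append(path)
    return unfilled


# A partly filled-in config, for illustration only
example = json.loads("""
{
    "global_input": {
        "list_of_subjects": "/home/study/sub_list",
        "outdir": "Required",
        "seed": ["Required unless file_tree specified"]
    },
    "nfact_decomp": {"dim": "100"}
}
""")
print(unfilled_required(example))
```

Note that the seed placeholder is reported even when a file_tree is given, so that entry needs a manual look.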
Everything marked as Required must be given. rois, warps and seed must be given in Python list format, like this:
"seed": ["l_seed.nii.gz", "r_seed.nii.gz"]
### nfact_config_decomp.config
This is the nfact_config_decomp.config file.
NFACT does its decomposition using scikit-learn's FastICA (https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.FastICA.html#sklearn.decomposition) and NMF (https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html), so any of the hyperparameters of these functions can be altered by changing the values in the JSON file.
{
    "ica": {
        "algorithm": "parallel",
        "whiten": "unit-variance",
        "fun": "logcosh",
        "fun_args": null,
        "max_iter": 200,
        "tol": 0.0001,
        "w_init": null,
        "whiten_solver": "svd",
        "random_state": null
    },
    "nmf": {
        "init": null,
        "solver": "cd",
        "beta_loss": "frobenius",
        "tol": 0.0001,
        "max_iter": 200,
        "random_state": null,
        "alpha_W": 0.0,
        "alpha_H": "same",
        "l1_ratio": 0.0,
        "verbose": 0,
        "shuffle": false
    }
}
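The values in this file map onto keyword arguments of the scikit-learn classes. The sketch below shows how such a file could be read and turned into keyword arguments. How NFACT itself passes the values on is not shown here, and the function name and the shortened example config are hypothetical.

```python
import json


def decomp_kwargs(config_text, algo):
    """Return the hyperparameter dict for 'ICA' or 'NMF' from the config text."""
    config = json.loads(config_text)
    return dict(config[algo.lower()])


# Shortened config with only a few hyperparameters, for illustration
example = '{"ica": {"max_iter": 500, "tol": 0.0001}, "nmf": {"init": null, "max_iter": 200}}'
ica_kwargs = decomp_kwargs(example, "ICA")
print(ica_kwargs)

# With scikit-learn installed, the values could then be passed on, e.g.:
# from sklearn.decomposition import FastICA
# model = FastICA(n_components=100, **ica_kwargs)
```

JSON null becomes Python None when the file is loaded, which is what scikit-learn expects for unset options such as random_state.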
### nfact_config_sublist
Given a directory, NFACT config will attempt to work out all the subjects in that directory and write them to a file. Though nfact will try to filter out folders that aren't subjects, it isn't perfect, so please check the subject list.
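Part of checking the subject list can be automated. The sketch below reports entries that are not existing directories; it assumes one subject path per line, which is not stated in this README, and the function name is hypothetical.

```python
from pathlib import Path


def check_subject_list(list_path):
    """Split the entries of a subject list into existing and missing directories."""
    lines = Path(list_path).read_text().splitlines()
    # Assumption: one subject directory per line, blank lines ignored
    subjects = [line.strip() for line in lines if line.strip()]
    found = [s for s in subjects if Path(s).is_dir()]
    missing = [s for s in subjects if not Path(s).is_dir()]
    return found, missing


# Usage (path is a placeholder):
# found, missing = check_subject_list("/home/study/nfact_config_sublist")
# print(f"{len(found)} subjects found, not directories: {missing}")
```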
------------------------------------------------------------------------------------------------------------------------------------------
## NFACT GLM
This module is currently a work in progress. The aim is to support hypothesis testing between groups.