asoroosh/DVARS

Suggestion: generating matrix of outliers

parekhpravesh opened this issue · 2 comments

Hello,

Thank you for this extensive work and the code. It is really helpful and the plots look amazing too! Perhaps the code could additionally save a .txt/.mat file containing a matrix of 0s and 1s marking the (pairs of) volumes that are considered outliers. This would be similar to the output file generated by FSL. Of course, it's not at all difficult to do this, but users might find it useful. This could come in two flavours: outliers detected using the statistical method and outliers detected based on practical significance.

Regards
Pravesh

Thanks Pravesh,

We ask users to build the binary regressors only from data-points that are both practically and statistically significant. This is mainly because the statistical inference is very sensitive on clean data-sets with large T (e.g. HCP).

Using practical significance, however, requires an arbitrary threshold on Delta p Dvars (i.e. Stat.DeltapDvar). This is why I prefer not to generate a binary regressor and instead leave it to the user, as the threshold may differ from study to study (mainly due to the length of the time series).

I added some lines to the function usage that can help one form such a binary regressor; see:

%   To generate a binary regressor, where the significant DVARS data-points 
%   are 1 and the remaining data-points are 0 you can use DVARSCalc.m as 
%   below:
%   
%   PracticalSigThr = 5;  
%   idx = find(Stat.pvals<0.05./(T-1) & Stat.DeltapDvar>PracticalSigThr);
%   DVARSreg = zeros(T0,1);
%   DVARSreg(idx)   = 1;
%   DVARSreg(idx+1) = 1;
%
%   Variable PracticalSigThr should be chosen manually for a study. For
%   example in case of HCP, we found 5% is a reasonable threshold to
%   identify the practically significant data-points. Note that practically
%   significant data-points are a subset of statistically significant
%   data-points.
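
As a minimal, self-contained sketch of the idea above, the snippet below builds the regressor and saves it as both .txt and .mat, as suggested in the issue. It does not call DVARSCalc.m: the `Stat` struct is a synthetic stand-in with made-up values, and it assumes `Stat.pvals` and `Stat.DeltapDvar` hold one entry per consecutive pair of volumes (T-1 entries). The output file name `DVARSreg` and the variable `Spikes` are hypothetical names chosen for illustration.

```matlab
% Synthetic stand-in for the Stat struct (values are made up for illustration;
% in practice Stat would come from DVARSCalc.m).
T = 10;                                  % number of volumes (assumed)
Stat.pvals      = ones(T-1,1);           % assumed: one p-value per DVARS point
Stat.DeltapDvar = zeros(T-1,1);          % assumed: one Delta p Dvars value per point
Stat.pvals([3 7])      = 1e-6;           % two statistically significant points
Stat.DeltapDvar([3 7]) = [12 2];         % only point 3 exceeds the practical threshold

PracticalSigThr = 5;                     % study-specific threshold, chosen manually
alpha = 0.05/(T-1);                      % Bonferroni correction over T-1 tests
idx = find(Stat.pvals(:) < alpha & Stat.DeltapDvar(:) > PracticalSigThr);

% Each DVARS point spans two volumes, so flag both members of the pair.
DVARSreg = zeros(T,1);
DVARSreg(idx)   = 1;
DVARSreg(idx+1) = 1;

% Optional: one column per flagged volume instead of a single column.
vols   = find(DVARSreg);
Spikes = zeros(T,numel(vols));
Spikes(sub2ind(size(Spikes),vols',1:numel(vols))) = 1;

% Save as plain text (one 0/1 per line) and as a .mat file.
outbase = fullfile(tempdir,'DVARSreg');  % hypothetical output location
fid = fopen([outbase '.txt'],'w');
fprintf(fid,'%d\n',DVARSreg);
fclose(fid);
save([outbase '.mat'],'DVARSreg','Spikes');
```

Swapping the combined condition for `Stat.pvals(:) < alpha` alone would give the purely statistical flavour, which the reply above advises against using on its own.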

Cheers,

-Soroosh

Hi,
I agree that the arbitrary threshold would depend on the length of the time series and would certainly be affected by site-related decisions. If, for example, this method is applied to task fMRI, then one should perhaps ensure that scans of one condition are not disproportionately regressed out compared to another condition (if they are, it is probably better to remove that subject from the analysis). These subtleties are probably best left to the users.
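
A rough sketch of such a check, assuming a binary regressor like the one described in the documentation and a per-volume condition label, could look like the following. Both vectors here are made-up illustrative data, and `cond` and `fracFlagged` are hypothetical names; what counts as "disproportionate" is left to the user.

```matlab
% Hypothetical inputs: a binary DVARS regressor and one condition label per volume.
DVARSreg = [0 0 1 1 0 0 0 1 1 0 0 0]';   % 1 = flagged volume (made up)
cond     = [1 1 1 1 2 2 2 2 1 1 2 2]';   % 1 = condition A, 2 = condition B (made up)

conds       = unique(cond);
fracFlagged = zeros(numel(conds),1);
for k = 1:numel(conds)
    inCond = (cond == conds(k));                        % volumes in this condition
    fracFlagged(k) = sum(DVARSreg(inCond))/sum(inCond); % fraction of them flagged
    fprintf('Condition %d: %d of %d volumes flagged\n', ...
        conds(k), sum(DVARSreg(inCond)), sum(inCond));
end
```

A large gap between the entries of `fracFlagged` would suggest that one condition loses far more scans than the other.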

I think the edited documentation is perfect! It is clear and helpful. Thanks!