Add an extra column in the output of conditionTest() with log2fc results
gabrielnegreira opened this issue · 3 comments
Hi,
First, thanks a lot for this package! I am currently using the conditionTest() function to look for DE genes along a trajectory between two samples I have (2 conditions). The function returns a table with genes, the WaldStat, the degrees of freedom and the p-values, but I wonder if it would not be possible to include an extra column with the log2 fold-change between both conditions? I imagine this should be possible as there is an option to set a log2fc threshold, so the function is calculating fold changes between conditions, right?
For me, it would be interesting to have the fold changes of each gene because if I set the l2fc threshold to log2(2) I get only 24 DE genes (with p < 0.05), but if I set it to log2(1.5) I get 200 genes (also with p < 0.05). So I would like to keep the threshold at log2(1.5) and check which of those genes are the most strongly DE (not necessarily reaching log2(2)). An extra column with log2fc values would therefore be very helpful.
Hi @gabrielnegreira,
Note that inference between lineages happens on a number of grid points along the trajectory. In the case of conditionTest, we are comparing two smoothers (one for each condition) within a lineage and assessing DE at each grid point. This means we have a number of fold-changes for each gene (e.g. if we are using 10 grid points, we'd be testing 10 fold-changes).
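To make the "one fold-change per grid point" idea concrete, here is a minimal sketch. It is not the package's internal code: it assumes you already have the fitted mean expression of one gene for each condition at the same grid points, and the function name `per_point_log2fc`, the `pseudocount` argument, and the example values are all hypothetical.

```python
import math


def per_point_log2fc(fit_a, fit_b, pseudocount=1e-6):
    """Return one log2 fold-change (A over B) per grid point.

    fit_a, fit_b: fitted mean expression of a single gene under
    conditions A and B, evaluated at the same grid points.
    pseudocount: small hypothetical offset to avoid division by zero.
    """
    if len(fit_a) != len(fit_b):
        raise ValueError("both conditions must use the same grid points")
    return [
        math.log2((a + pseudocount) / (b + pseudocount))
        for a, b in zip(fit_a, fit_b)
    ]


# Made-up fitted values at 5 grid points along one lineage.
cond_a = [4.0, 8.0, 16.0, 8.0, 4.0]
cond_b = [4.0, 4.0, 4.0, 8.0, 4.0]
log2fc = per_point_log2fc(cond_a, cond_b)
```

With 5 grid points this yields 5 values for the gene, so any single "log2fc" column has to be a summary of that vector.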
In previous tests, we have been providing the median log FC across the grid points, but it turns out that this is actually not very helpful. It would be useful to discuss what effect size measure could be deemed most useful for ranking the genes as you suggest. For example, would the maximum log2 fold-change be relevant?
Hi @koenvandenberge ,
Thanks a lot for the reply. Indeed, I think the maximum log2 fold-change between conditions could be helpful. Maybe also including an extra column indicating in which gridpoint that maximum happens?
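As a rough sketch of what I have in mind (not an existing function in the package; `summarize_log2fc` and the example values are hypothetical), the summary could report the largest absolute log2 fold-change together with the grid point where it occurs. The median is included only for comparison, to show how it can hide a difference that is confined to a few grid points.

```python
from statistics import median


def summarize_log2fc(log2fc):
    """Summarize the per-grid-point log2 fold-changes of one gene.

    Returns the median, the fold-change with the largest absolute
    value (sign preserved), and the 1-based grid point where it occurs.
    """
    if not log2fc:
        raise ValueError("need at least one grid point")
    idx = max(range(len(log2fc)), key=lambda i: abs(log2fc[i]))
    return {
        "median_log2fc": median(log2fc),
        "max_log2fc": log2fc[idx],
        "max_grid_point": idx + 1,
    }


# Made-up values: a strong difference at grid point 3 only.
summary = summarize_log2fc([0.0, 0.1, 2.0, -0.2, 0.0])
```

For these made-up values the median is 0 while the maximum is 2 at grid point 3, so ranking by the maximum would surface this gene and ranking by the median would not.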
@koenvandenberge
For what it's worth, maximum would be useful, but what about a per-knot Log2FC reported as a list of doubles ordered by knot number?
E.g., for a 5-knot fit:
0.0101, 0.552, 0.021, 0.00001, 0.0093
Then the user can decide how to report the log2FC?
regards,
Kieran