biocore/qadabra

Qadabra Snakemake file using unexpected p-values from corncob and metagenomeseq

I was reviewing the Snakemake file that shows which fields from each tool are used for the differentials and p-values when I noticed something unexpected about the values taken from corncob and metagenomeseq. It looks as though the pvalues field from metagenomeseq is being used instead of adjPvalues, and the fit.p field from corncob is being used instead of adjusted_p_values. At first I thought this could be because Qadabra performs FDR correction internally, but in that case I'd expect the p-value fields for the other tools to be different as well.

Here is a link to the Snakemake file I'm referring to: https://github.com/biocore/qadabra/blob/main/qadabra/workflow/rules/common.smk .
Here's the snippet that shows where the p-values are coming from:

def get_pvalue_tool_columns(wildcards):
    d = datasets.loc[wildcards.dataset].to_dict()
    covariate = d["factor_name"]
    target = d["target_level"]
    reference = d["reference_level"]

    columns = {
        "edger": "PValue",
        "deseq2": "pvalue",
        "ancombc": "pvals",
        "aldex2": f"model.{covariate}{target} Pr(>|t|)",
        "maaslin2": "pval",
        "metagenomeseq": "pvalues",
        "corncob": "fit.p",
    }
    return columns[wildcards.tool]

Can someone please explain what's going on there? I'll continue to look into the documentation for each tool and update this issue if I find anything relevant.
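To illustrate why the choice between raw and adjusted columns matters, here is a minimal sketch of Benjamini-Hochberg FDR correction in plain Python. It is not Qadabra code and is not necessarily the correction method any of these tools applies; the function name `benjamini_hochberg`, the input p-values, and the 0.05 threshold are all made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Illustrative helper only (hypothetical name, not part of Qadabra).
    """
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that the
    # adjusted values stay monotone (a smaller raw p-value never ends
    # up with a larger adjusted p-value).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four features.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)

alpha = 0.05  # assumed significance threshold
n_raw_hits = sum(p < alpha for p in raw)
n_adj_hits = sum(p < alpha for p in adj)
print(adj)
print(n_raw_hits, n_adj_hits)
```

With these made-up inputs, three features fall below the threshold on raw p-values but only one does after adjustment, so mixing raw columns for some tools with adjusted columns for others would make the tools' results hard to compare.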

@411an13 Hi Allan! Thanks for catching this. I created a new PR #65 that changes get_pvalue_tool_columns to use the name of each tool's corrected p-value column instead of the uncorrected one. We will merge this shortly. Thank you for your patience!