gavinmdouglas/FuncDiv

UniFrac implementation issues

Closed this issue · 2 comments

I was reading the FuncDiv paper and this sentence caught my attention: "UniFrac methods are implemented based on a modified version of the UniFrac function from phyloseq".

We recently came across phyloseq issue 956. Based on that discussion, there seem to be some important differences in how UniFrac is calculated in QIIME 2 and in phyloseq. The conclusion seems to be that rbiom matches QIIME 2 best (it also gives results similar to two other independent R implementations, but not to phyloseq). It is also the fastest R implementation.

This might be something to check.

Thanks a lot for pointing this out, I was unaware of this problem!

Based on this issue, I decided to swap in the rbiom UniFrac implementation (commit 3304f8f), which is now used instead of the phyloseq implementation. I will make this clear in the release notes for the next release!

Edit: I should clarify, in case anyone else is looking into this, that the unweighted UniFrac results appeared to be identical between the two approaches. The weighted results did differ, although it's not clear to me whether this was due to a difference in whether the distances were normalized, or whether it was an actual bug. Either way, I think it makes sense to switch to the rbiom approach, if only to be more consistent with how QIIME 2 calculates weighted UniFrac.

Thanks again,

Gavin
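
For anyone else comparing implementations: the normalization question raised in the edit above can be illustrated with a small, self-contained Python sketch. It does not reproduce the rbiom, phyloseq, or QIIME 2 code, and it says nothing about whether phyloseq has a bug. The toy tree, the sample counts, and every function name below are hypothetical and made up for illustration. The sketch only shows that on the same data, raw and normalized weighted UniFrac give different numbers, so two tools with different normalization defaults would disagree on weighted distances while still agreeing on unweighted ones.

```python
# Hypothetical toy tree: each branch maps to (length, set of tips below it).
#   root -> n1 (1.0) -> A (0.5), B (0.5)
#   root -> C (2.0)
BRANCHES = {
    "n1": (1.0, {"A", "B"}),
    "A": (0.5, {"A"}),
    "B": (0.5, {"B"}),
    "C": (2.0, {"C"}),
}


def proportions(counts):
    """Convert raw tip counts into relative abundances."""
    total = sum(counts.values())
    return {tip: n / total for tip, n in counts.items()}


def branch_fraction(props, tips):
    """Fraction of a sample's abundance that descends from a branch."""
    return sum(props.get(tip, 0.0) for tip in tips)


def root_distance(tip):
    """Distance from the root to a tip: sum of branch lengths above it."""
    return sum(length for length, tips in BRANCHES.values() if tip in tips)


def weighted_unifrac(s1, s2, normalized):
    """Weighted UniFrac; optionally divided by the abundance-weighted
    root-to-tip distance (the usual 'normalized' variant)."""
    p1, p2 = proportions(s1), proportions(s2)
    raw = sum(
        length * abs(branch_fraction(p1, tips) - branch_fraction(p2, tips))
        for length, tips in BRANCHES.values()
    )
    if not normalized:
        return raw
    all_tips = set().union(*(tips for _, tips in BRANCHES.values()))
    scale = sum(
        root_distance(tip) * (p1.get(tip, 0.0) + p2.get(tip, 0.0))
        for tip in all_tips
    )
    return raw / scale


def unweighted_unifrac(s1, s2):
    """Unweighted UniFrac: unique branch length over total observed length."""
    p1, p2 = proportions(s1), proportions(s2)
    unique = shared_or_unique = 0.0
    for length, tips in BRANCHES.values():
        in1 = branch_fraction(p1, tips) > 0
        in2 = branch_fraction(p2, tips) > 0
        if in1 or in2:
            shared_or_unique += length
        if in1 != in2:
            unique += length
    return unique / shared_or_unique


# Two made-up samples that differ only in which of A or B is present.
sample1 = {"A": 10, "B": 0, "C": 10}
sample2 = {"A": 0, "B": 10, "C": 10}

raw = weighted_unifrac(sample1, sample2, normalized=False)
norm = weighted_unifrac(sample1, sample2, normalized=True)
unweighted = unweighted_unifrac(sample1, sample2)
print(raw, norm, unweighted)
```

On this toy input the raw weighted distance is 0.5 and the normalized one is 0.5 / 3.5 (about 0.143), while the unweighted distance is 0.25 either way. When comparing two packages, it is worth checking each one's normalization setting before concluding that the weighted results disagree for any other reason.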