Interpretation of cami-report vs binning-file
dalofa commented
Hi,
Once again thanks for writing taxor!
I am wondering how I should interpret the CAMI report vs. the binning file. Looking at the CAMI report, it seems the reads stem from two species:
cat barcode43.cami
@SampleID:barcode43
@Version:0.10.0
@Ranks:superkingdom|phylum|class|order|family|genus|species
@@TAXID RANK TAXPATH TAXPATHSN PERCENTAGE
2 superkingdom 2 Bacteria 100
1224 phylum 2|1224 Bacteria|Pseudomonadota 12.9099
1239 phylum 2|1239 Bacteria|Bacillota 87.0901
28216 class 2|1224|28216 Bacteria|Pseudomonadota|Betaproteobacteria 12.9099
91061 class 2|1239|91061 Bacteria|Bacillota|Bacilli 87.0901
1385 order 2|1239|91061|1385 Bacteria|Bacillota|Bacilli|Bacillales 87.0901
80840 order 2|1224|28216|80840 Bacteria|Pseudomonadota|Betaproteobacteria|Burkholderiales 12.9099
119060 family 2|1224|28216|80840|119060 Bacteria|Pseudomonadota|Betaproteobacteria|Burkholderiales|Burkholderiaceae 12.9099
90964 family 2|1239|91061|1385|90964 Bacteria|Bacillota|Bacilli|Bacillales|Staphylococcaceae 87.0901
1279 genus 2|1239|91061|1385|90964|1279 Bacteria|Bacillota|Bacilli|Bacillales|Staphylococcaceae|Staphylococcus 87.0901
1822464 genus 2|1224|28216|80840|119060|1822464 Bacteria|Pseudomonadota|Betaproteobacteria|Burkholderiales|Burkholderiaceae|Paraburkholderia 12.9099
1290 species 2|1239|91061|1385|90964|1279|1290 Bacteria|Bacillota|Bacilli|Bacillales|Staphylococcaceae|Staphylococcus|Staphylococcus hominis 87.0901
134536 species 2|1224|28216|80840|119060|1822464|134536 Bacteria|Pseudomonadota|Betaproteobacteria|Burkholderiales|Burkholderiaceae|Paraburkholderia|Paraburkholderia caledonica 12.9099
However, looking at the binning file, only 2 out of 8293 reads are assigned a TAXID. Does this mean that the abundance estimation is based on only these two reads?
JensUweUlrich commented
Yes, for the CAMI report file, your assumption is correct. In such a case, I recommend using the sequence abundance file, which also takes the unclassified reads into account when estimating abundances for the whole sample.
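To illustrate the point, here is a minimal Python sketch (not part of taxor) that sums the PERCENTAGE column per rank in a CAMI profile and compares that with the share of reads that were actually classified. It assumes the report is tab-separated, as in the CAMI profiling format; the function names `percentage_per_rank` and `classified_fraction` are hypothetical helpers, and the read counts are the ones quoted in the question above.

```python
from collections import defaultdict


def percentage_per_rank(cami_text):
    """Sum the PERCENTAGE column for every rank in a CAMI profile.

    Assumes tab-separated columns: TAXID, RANK, TAXPATH, TAXPATHSN, PERCENTAGE.
    """
    totals = defaultdict(float)
    for line in cami_text.splitlines():
        line = line.strip()
        # Skip empty lines and header lines ("@..." and "@@...").
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        rank = fields[1]
        totals[rank] += float(fields[-1])
    return dict(totals)


def classified_fraction(classified_reads, total_reads):
    """Fraction of all reads in the sample that received a TAXID."""
    return classified_reads / total_reads


# Phylum and species rows copied from the report in this thread.
report = "\n".join([
    "@SampleID:barcode43",
    "@@TAXID\tRANK\tTAXPATH\tTAXPATHSN\tPERCENTAGE",
    "1224\tphylum\t2|1224\tBacteria|Pseudomonadota\t12.9099",
    "1239\tphylum\t2|1239\tBacteria|Bacillota\t87.0901",
    "1290\tspecies\t2|1239|91061|1385|90964|1279|1290\t"
    "Bacteria|Bacillota|Bacilli|Bacillales|Staphylococcaceae|"
    "Staphylococcus|Staphylococcus hominis\t87.0901",
    "134536\tspecies\t2|1224|28216|80840|119060|1822464|134536\t"
    "Bacteria|Pseudomonadota|Betaproteobacteria|Burkholderiales|"
    "Burkholderiaceae|Paraburkholderia|Paraburkholderia caledonica\t12.9099",
])

for rank, total in percentage_per_rank(report).items():
    # Each rank adds up to 100, so unclassified reads are not represented.
    print(f"{rank}: {total:.4f}")

# 2 classified reads out of 8293, as reported in the binning file.
print(f"classified: {classified_fraction(2, 8293):.4%}")
```

The per-rank totals come out at 100 even though only about 0.024% of the reads were classified, which is why the CAMI percentages should not be read as shares of the whole sample.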
dalofa commented
Thanks a ton!