gavinmdouglas/picrust2_manuscript

PICRUSt2 categorize_by_function.py

Hello,
I'm wondering whether PICRUSt2 includes a categorize_by_function.py script to collapse the pathways into higher levels of the pathway hierarchy.
I've successfully produced KO_metagenome_out and EC_metagenome_out, but I can still only analyze and visualize individual KOs rather than functional groups.
Any thoughts?
Thank you

Hey @zina-R,

I wrote a response to this question, which you can see here: https://github.com/picrust/picrust2/wiki/Frequently-Asked-Questions#how-can-i-run-categorize_by_functionpy-like-in-picrust1

Please let me know if you need further clarification.

All the best,

Gavin

Hello @gavinmdouglas,
Thank you for your answer, but I am confused. I don't understand how to get from the PICRUSt2 tutorial results (the descriptions of each KO and EC, produced from FASTA and BIOM files) to categorize_by_function.py. If I understand correctly, do I need to repeat all the steps (Normalize OTU Table, Predict Functions For Metagenome, Collapse predictions into pathways)?
Best,

Hi @zina-R,

If you're comfortable in R, you can use the R script linked in that FAQ response. Otherwise, yes, you will need to run all of the PICRUSt1 scripts to get KEGG pathways in the format expected by categorize_by_function.py.

Cheers,

Gavin

Hello @gavinmdouglas,
I would like to ask whether the BIOM file used in PICRUSt2 is the same as the one used in PICRUSt1. For example, in this command:

normalize_by_copy_number.py \
    --gg_version 18may2012 \
    -i hmp_mock_16S.biom \
    -o normalized_otus.biom

Can I use the same BIOM file that I used in PICRUSt2? Note that I didn't use QIIME to generate the OTU table.
Best,

Hello again @gavinmdouglas,
I am trying to do it in R as mentioned in the script, but I got this error while running the function: object 'kegg_brite_map' not found.
If I understand correctly, do I need to run PICRUSt1 to get kegg_brite_map?
If so, why use other methods if we have to go through PICRUSt1 in every case? I am so confused.

##kegg_brite_map <- read.table("/path/to/picrust1_KO_BRITE_map.tsv", header=TRUE, sep="\t", quote = "", stringsAsFactors = FALSE, comment.char="", row.names=1)
Is this file a PICRUSt1 output?
Best,

Hey @zina-R ,

No, you don't need to run PICRUSt1. See this part of the FAQ:

"You can download the R code here: https://www.dropbox.com/s/91pohevw4ayxtn2/picrust1_categorize_by_func.R?dl=1 and the legacy table of mappings from KOs to BRITE hierarchy here (which is a required input file): https://www.dropbox.com/s/a5o4li0irsqupt3/picrust1_KO_BRITE_map.tsv?dl=1."

You need to read in picrust1_KO_BRITE_map.tsv after downloading it (which you can do by uncommenting the line you posted and replacing /path/to/ with the location of the downloaded file).

Cheers,

Gavin
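
For readers who want to see what the collapsing step amounts to, here is a minimal Python sketch. It is not the linked R script and not part of PICRUSt2. It assumes each KO maps to one or more three-level hierarchy paths and adds each KO's abundance to every distinct category it belongs to at the chosen level. The function name `collapse_by_level`, the in-memory data layout, and the toy values are all hypothetical; the real picrust1_KO_BRITE_map.tsv layout may differ and would need to be parsed into this shape first.

```python
from collections import defaultdict


def collapse_by_level(ko_abundances, ko_to_paths, level):
    """Sum KO abundances into hierarchy categories at the given level (1-3).

    ko_abundances: {ko_id: {sample_id: abundance}}
    ko_to_paths:   {ko_id: [(level1, level2, level3), ...]}  (hypothetical layout)
    """
    collapsed = defaultdict(lambda: defaultdict(float))
    for ko, per_sample in ko_abundances.items():
        # A KO can belong to several categories; count it once per distinct category.
        categories = {path[level - 1] for path in ko_to_paths.get(ko, [])}
        for category in categories:
            for sample, value in per_sample.items():
                collapsed[category][sample] += value
    return {cat: dict(samples) for cat, samples in collapsed.items()}


# Toy data, invented for illustration only.
ko_abundances = {
    "K00001": {"S1": 10.0, "S2": 4.0},
    "K00002": {"S1": 5.0, "S2": 1.0},
    "K00003": {"S1": 2.0, "S2": 8.0},
}
ko_to_paths = {
    "K00001": [("Metabolism", "Carbohydrate Metabolism", "Pathway A")],
    "K00002": [
        ("Metabolism", "Carbohydrate Metabolism", "Pathway B"),
        ("Metabolism", "Lipid Metabolism", "Pathway C"),
    ],
    # K00003 has no mapping in this toy example, so it is dropped.
}

level1 = collapse_by_level(ko_abundances, ko_to_paths, level=1)
level2 = collapse_by_level(ko_abundances, ko_to_paths, level=2)
```

Because one KO can sit in several categories, the collapsed rows are not mutually exclusive and their totals can exceed the total KO abundance. Counting a KO once per distinct category is a choice made in this sketch, not a statement about how the original script behaves.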

Hi @gavinmdouglas,
This command refers to pred_metagenome_strat.tsv:

pathway_pipeline.py -i KO_metagenome_out/pred_metagenome_strat.tsv -o KEGG_pathways_out --no_regroup --map picrust2/picrust2/default_files/pathway_mapfiles/KEGG_pathways_to_KO.tsv

I am trying to follow the script to get the strat.tsv output and use it in the previous command, but I only get the unstrat.tsv.gz file. I already added --strat_out as shown below. How can I get the stratified output?

place_seqs.py -s SWARM_OTUs_curated.fasta -o placed_seqs.tre -p 1 --intermediate placement_working

2-Hidden state prediction

hsp.py -i 16S -t placed_seqs.tre -o marker_nsti_predicted.tsv.gz -p 1 -n
hsp.py -i EC -t placed_seqs.tre -o EC_predicted.tsv.gz -p 1
hsp.py -i KO -t placed_seqs.tre -o KO_predicted.tsv.gz -p 1

3-Metagenome prediction

metagenome_pipeline.py -i SWARM_table_curated.biom \
    -m marker_nsti_predicted.tsv.gz \
    -f EC_predicted.tsv.gz \
    -o EC_metagenome_out --strat_out

metagenome_pipeline.py -i SWARM_table_curated.biom \
    -m marker_nsti_predicted.tsv.gz \
    -f KO_predicted.tsv.gz \
    -o KO_metagenome_out --strat_out

Best,

Good afternoon,
I am writing in relation to some further issues with this process.
I followed the steps involving the R function as indicated by @gavinmdouglas and was able to reproduce them. As a result, I got three separate datasets, one for each KO level. However, I intend to use my data in LEfSe for functional biomarker analysis, and for this I would like to have the information from all three levels in a single data table so that I can later create a cladogram. With taxonomic data you can include different hierarchical taxonomic levels separated by |, and this is what I am trying to do with the functional data.
Could anyone help me solve this? Does anyone know of a way to get this final output?
Thanks in advance, Mikel.
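
One possible way to approach this is sketched below in Python, under stated assumptions. It starts from a table keyed by the full three-level path and emits one row per prefix of that path, with the levels joined by `|` and abundances summed at each depth. The function names `build_hierarchical_table` and `to_tsv`, the input layout, and the toy values are hypothetical. Whether LEfSe accepts these labels as-is (for example, labels containing spaces) should be checked against its documentation.

```python
from collections import defaultdict


def build_hierarchical_table(level3_table):
    """Return rows labelled "L1", "L1|L2" and "L1|L2|L3", summed from full-path rows.

    level3_table: {(level1, level2, level3): {sample_id: abundance}}  (hypothetical layout)
    """
    combined = defaultdict(lambda: defaultdict(float))
    for path, per_sample in level3_table.items():
        # Add this row's abundance to every prefix of its hierarchy path.
        for depth in range(1, len(path) + 1):
            label = "|".join(path[:depth])
            for sample, value in per_sample.items():
                combined[label][sample] += value
    # Sorting keeps each parent row directly above its children.
    return {label: dict(samples) for label, samples in sorted(combined.items())}


def to_tsv(table, samples):
    """Render the combined table as tab-separated text, one feature per row."""
    lines = ["\t".join(["feature"] + samples)]
    for label, per_sample in table.items():
        values = [str(per_sample.get(sample, 0.0)) for sample in samples]
        lines.append("\t".join([label] + values))
    return "\n".join(lines)


# Toy data, invented for illustration only.
level3_table = {
    ("Metabolism", "Carbohydrate Metabolism", "Pathway A"): {"S1": 10.0, "S2": 4.0},
    ("Metabolism", "Lipid Metabolism", "Pathway C"): {"S1": 5.0, "S2": 1.0},
}

combined = build_hierarchical_table(level3_table)
tsv_text = to_tsv(combined, ["S1", "S2"])
```

Note that if the level-3 abundances already count a KO under more than one pathway, summing them upward counts that KO more than once at the higher levels. In that case it may be safer to collapse each level separately from the KO table and only then prefix the labels with their parent categories.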

Hello,

I am a new user of the PICRUSt software package. When I run the command below, the folder "KO_metagenome_out" is created, but it is empty. I re-ran the command and got the same result. Please let me know if you have any suggestions for solving this issue. Thank you.

metagenome_pipeline.py -i feature-table.biom -m marker_predicted_and_nsti.tsv.gz -f KO_predicted.tsv.gz -o KO_metagenome_out --strat_out

Hi there,
I have run the command below to categorize the PICRUSt2 output at a higher level:
pathway_pipeline.py -i KO_metagenome_out/pred_metagenome_strat.tsv -o KEGG_pathways_out --no_regroup --map picrust2/picrust2/default_files/pathway_mapfiles/KEGG_pathways_to_KO.tsv
and it was successful.
My question now is: how can I convert the KEGG_pathways_out output into a KEGG pathway table that can be loaded into STAMP?