How to incorporate publicly available SplAdder GFF3 results into a new run?
MeilingJinCOH opened this issue · 2 comments
- SplAdder version: 2.4.4
- Python version: Python 3.8.1
- Operating System: CentOS Linux Version 7
Description
I downloaded GFF3 files from https://gdc.cancer.gov/about-data/publications/PanCanAtlas-Splicing-2018. I would like to incorporate another dataset into this dataset. The new dataset contains about 1000 samples. I expect a result that contains all events from the downloaded reference plus any new events detected in the new dataset. Could you please suggest how to run SplAdder to achieve this?
Thanks,
Meiling
Dear @MeilingJinCOH ,
Thanks for reaching out. As a general strategy, I would suggest following the steps for running SplAdder on large cohorts (https://spladder.readthedocs.io/en/latest/spladder_cohort.html). In terms of events, SplAdder will only output events that have read support from the RNA-Seq samples it is provided with. That is, if some events were detected in the PanCanAtlas cohort but are not supported in your 1000 samples, SplAdder will not report them.
You can change this by treating the PanCanAtlas GFF3 file as the annotation and using the option --use-anno-support, so that all events in the annotation are also reported, irrespective of whether they are supported by the given RNA-Seq data.
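As a rough sketch of what such a call might look like, the Python snippet below only assembles a `spladder build` command line and prints it; it does not run SplAdder. The file names (`pancanatlas_events.gff3`, `sample_001.bam`, `sample_002.bam`) and the output directory are hypothetical placeholders, and whether the downloaded GFF3 is accepted as an annotation without conversion is an assumption that should be checked on a small subset first. For a cohort of about 1000 samples, the cohort workflow linked above is likely a better fit than a single call like this one.

```python
import shlex


def build_spladder_command(annotation, bam_files, outdir):
    """Assemble a `spladder build` call that also reports annotated events.

    All arguments are placeholders supplied by the caller; nothing is executed here.
    """
    return [
        "spladder", "build",
        "--annotation", annotation,      # assumed: the PanCanAtlas GFF3 used as annotation
        "--bams", ",".join(bam_files),   # comma-separated list of alignment files
        "--outdir", outdir,
        "--use-anno-support",            # option suggested in this thread
    ]


# Hypothetical file names, for illustration only.
cmd = build_spladder_command(
    "pancanatlas_events.gff3",
    ["sample_001.bam", "sample_002.bam"],
    "spladder_out",
)
print(shlex.join(cmd))
```

The printed line can be inspected and then run in a shell, or the list can be passed to `subprocess.run(cmd, check=True)` once the inputs are in place.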
Best,
Andre
Dear Andre,
Thank you very much for the response. I believe "--use-anno-support" is what I'm looking for. Please close the issue.
Best,
Meiling