pinellolab/dictys

datasets input as scATAC only

willey2020 opened this issue · 4 comments

Hello! Thank you so much for this great package!
May I ask a stupid question: if a dataset is scATAC-only, without scRNA, is there anything this package can still infer for some of its steps? (For example, could a gene's accessibility score be "used" as a proxy for the gene being on or off, as scRNA would provide?)
I know this is a ridiculous question for a multiomic package, but I am still curious about datasets that are scATAC-only. Thank you!

Thank you again!

Hi willey2020,

Thank you for your interest and question.

Dictys is designed for network inference and analysis using both data modalities. We have not tested it on scATAC-seq data alone.

However, we are open to alternative ways of using Dictys if you are aware of the limitations, which seems to be the case. In a typical biomedical investigation, I would imagine the use of multiple lines of evidence to support your conclusion. Therefore, I do not see a big issue with exploratory analysis when facing practical constraints, as long as you address these limitations and support your conclusion with additional experiments and analyses. Giving it a try and looking at the output would probably help you make the decision. Personally, I am also curious how Dictys would perform in such scenarios.

Alternatively, you may seek separate scRNA-seq data from published datasets or new experiments.

Hope it helps!
Lingfei

Thank you so much Lingfei! This looks so encouraging! Could I confirm what you mentioned here: does it mean I can provide only the scATAC profile (for example, data processed by ArchR or STREAM) directly to Dictys, and it will still be able to compute the GRN in the cell-type-specific or dynamic scenario? Thank you so much again!!

No problem! Just to clarify, in case I sounded overly optimistic.

First, I assume your scATAC-seq data are of good quality and the relevant analysis results make good sense. Make sure you have enough cells for the cell type (>1000, a rule of thumb that depends on data quality) or differentiation process (>5000) of interest. Dictys still needs the transcriptome read count matrix, which you need to estimate with existing methods, e.g. through a gene activity matrix. Then, I suggest running some of the relevant Dictys tutorials to test your software/hardware infrastructure and to understand the input/output and the process. After that, you can put your own data in place for network inference and analysis with Dictys. Finally, check the network quality against known biology first before looking for new biology, and follow up with sufficient experiments and analyses to support any new findings.
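To make the gene activity idea concrete, here is a minimal sketch of one common heuristic: for each gene, sum the scATAC counts of every peak that overlaps the gene body plus an upstream promoter window. This is an illustration only, not the method used by ArchR or any other tool, and it does not produce Dictys's input format (see the tutorials for that). The function names, the 2000 bp `upstream` window, and the toy data are all assumptions made for this example; only the cell-number thresholds come from the comment above.

```python
# Rule-of-thumb minimum cell numbers quoted in this thread
# (they depend on data quality, so treat them as a soft check).
MIN_CELLS = {"cell_type": 1000, "differentiation": 5000}


def has_enough_cells(n_cells, analysis):
    """Return True if n_cells exceeds the rule-of-thumb minimum."""
    return n_cells > MIN_CELLS[analysis]


def gene_activity(peaks, counts, genes, upstream=2000):
    """Estimate a gene-by-cell pseudo-expression matrix from scATAC counts.

    peaks:    list of (chrom, start, end), one entry per peak
    counts:   list of per-peak rows, each a list of counts per cell
    genes:    dict of gene name -> (chrom, start, end, strand)
    upstream: promoter window in bp (hypothetical default)
    """
    n_cells = len(counts[0]) if counts else 0
    activity = {}
    for name, (chrom, start, end, strand) in genes.items():
        # Extend the gene body upstream of the TSS to include the promoter.
        if strand == "+":
            lo, hi = max(0, start - upstream), end
        else:
            lo, hi = start, end + upstream
        row = [0] * n_cells
        for (p_chrom, p_start, p_end), p_counts in zip(peaks, counts):
            # Half-open interval overlap test on the same chromosome.
            if p_chrom == chrom and p_start < hi and p_end > lo:
                row = [a + b for a, b in zip(row, p_counts)]
        activity[name] = row
    return activity


# Toy example: 3 peaks x 3 cells, 2 made-up genes.
peaks = [("chr1", 1000, 1500), ("chr1", 5000, 5600), ("chr2", 100, 400)]
counts = [[1, 0, 2], [0, 3, 1], [4, 4, 4]]
genes = {
    "GENE_A": ("chr1", 4000, 6000, "+"),
    "GENE_B": ("chr2", 0, 300, "-"),
}
activity = gene_activity(peaks, counts, genes)
print(activity)
print(has_enough_cells(len(counts[0]), "cell_type"))
```

In practice you would more likely take the gene activity or gene score matrix from your existing scATAC pipeline. Either way, the result is only a proxy for expression, which is the main limitation to keep in mind when interpreting the inferred network.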

I should warn you of the potential time investment and the exploratory nature of this analysis. Still, trial and (hopefully no) error is the nature of scientific research. We are always driven by potential outcomes.

Let me know if you have further questions.

Thank you so much Lingfei! I will give it a try! Thank you again!