snap-stanford/GEARS

Directly implementing gears on scRNA-seq data

Closed this issue · 3 comments

Hi,
Thank you for publishing such an excellent paper! I'm new to Perturb-seq and only have single-cell transcriptome data. Can I use GEARS directly on scRNA-seq data to infer the perturbation, or must I provide my own Perturb-seq data?
Thanks!

Since I only have scRNA-seq data, I set the condition to ctrl for all cells in step 2 of the data_tutorial.ipynb file ((2) Create your own Perturb-Seq data). I got this error:
AttributeError Traceback (most recent call last)
Cell In[50], line 7
----> 7 pert_data.new_data_process(dataset_name = 'scRNA', adata = adata) # specific dataset name and adata object

File ~/anaconda3/lib/python3.8/site-packages/gears/pertdata.py:250, in PertData.new_data_process(self, dataset_name, adata, skip_calc_de)
248 os.mkdir(save_data_folder)
249 self.dataset_path = save_data_folder
--> 250 self.adata = get_DE_genes(adata, skip_calc_de)
251 if not skip_calc_de:
252 self.adata = get_dropout_non_zero_genes(self.adata)

File ~/anaconda3/envs/Tres/lib/python3.8/site-packages/gears/data_utils.py:64, in get_DE_genes(adata, skip_calc_de)
62 adata.obs = adata.obs.astype('category')
63 if not skip_calc_de:
---> 64 rank_genes_groups_by_cov(adata,
65 groupby='condition_name',
66 covariate='cell_type',
67 control_group='ctrl_1',
68 n_genes=len(adata.var),
69 key_added = 'rank_genes_groups_cov_all')
...
631 'pvals_adj': 'float64',
632 }
634 for col in test_obj.stats.columns.levels[0]:

AttributeError: 'NoneType' object has no attribute 'columns'

I think this is because I only provided one condition. Does this mean that I cannot directly use GEARS on non-Perturb-seq data?

yhr91 commented

Sorry for the delayed response!

It's hard to tell exactly which dataset you used for training, but if I understand correctly, you tried to train GEARS on non-Perturb-seq data. This will not work, as GEARS needs perturbational data, ideally from a few different genetic perturbations.
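
As a rough illustration of that requirement, here is a minimal sketch of a check that could be run on the condition labels before calling `new_data_process`. It is not part of GEARS: the helper names are hypothetical, the `GENE+ctrl` label format and the placeholder gene names are assumptions, and `conditions` stands in for whatever list of per-cell condition labels you have (for example, the values of a condition column in `adata.obs`).

```python
from collections import Counter

CONTROL_LABEL = "ctrl"  # assumption: label used for unperturbed cells


def summarize_conditions(conditions, control_label=CONTROL_LABEL):
    """Count control cells and list the distinct non-control condition labels."""
    counts = Counter(conditions)
    perturbed = sorted(c for c in counts if c != control_label)
    return {
        "n_control_cells": counts.get(control_label, 0),
        "perturbed_conditions": perturbed,
    }


def has_perturbation_data(conditions, control_label=CONTROL_LABEL):
    """Return True only if control cells and at least one perturbed condition exist."""
    summary = summarize_conditions(conditions, control_label)
    return summary["n_control_cells"] > 0 and len(summary["perturbed_conditions"]) > 0


# Plain scRNA-seq: every cell labelled as control, as in the report above.
scrna_only = ["ctrl"] * 5
# Hypothetical Perturb-seq-style labels (gene names are placeholders).
perturb_seq = ["ctrl", "ctrl", "GENE_A+ctrl", "GENE_B+ctrl", "GENE_A+GENE_B"]

print(has_perturbation_data(scrna_only))
print(has_perturbation_data(perturb_seq))
```

With only `ctrl` labels there is no second group to compare against, which is consistent with the differential-expression step in the traceback having nothing to rank.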

If I misunderstood your question, please feel free to re-open this issue.

I too have the same question as @XiaoMi93!

I was wondering whether one of the pre-trained models (trained on Perturb-seq data) can be applied to scRNA-seq data to infer the possible transcriptional response to a list of known/given perturbations. The goal is not to train the model, but to apply one already trained on a different data source, even though the cell types and tissues differ, just to look for possible similarities.

Thanks in advance!