Question about how to generate the condition column
Closed this issue · 3 comments
Hi Yusuf et al.
Assuming my adata has 3000 cells and I want to perturb two genes (geneA and geneB; I know it's not ideal to train models with so few genes), is the following code reasonable?
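import numpy as np

# Assign conditions by row order: 1000 cells each for geneA, geneB, and ctrl
tem = []
tem.extend(list(np.repeat("geneA+ctrl", 1000)))
tem.extend(list(np.repeat("geneB+ctrl", 1000)))
tem.extend(list(np.repeat("ctrl", 1000)))
# The list length must equal the number of rows in adata.obs (3000 here)
adata.obs = adata.obs.assign(condition=tem)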
Of the 3000 cells, 1000 are set as ctrl and the remaining cells are split evenly between geneA and geneB. Is it okay to do this?
Syntactically this is fine, but it may not produce a strong model.
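As a rough illustration of the "syntactically fine" part, here is a minimal sketch that builds the same labels and checks that their number matches the number of cells before assigning them. The helper `make_condition_labels` and its arguments are hypothetical names made up for this example, not part of any library, and the 3000-cell count is the one assumed in the question.

```python
import numpy as np

def make_condition_labels(n_cells, cells_per_condition):
    """Build one condition label per cell, in row order.

    `cells_per_condition` maps a label such as "geneA+ctrl" to a cell count.
    This helper is hypothetical and only illustrates the length check.
    """
    labels = np.concatenate(
        [np.repeat(name, count) for name, count in cells_per_condition.items()]
    )
    if labels.shape[0] != n_cells:
        raise ValueError(f"got {labels.shape[0]} labels for {n_cells} cells")
    return labels

labels = make_condition_labels(
    3000, {"geneA+ctrl": 1000, "geneB+ctrl": 1000, "ctrl": 1000}
)
# With a real AnnData object this would be: adata.obs["condition"] = labels
```

Note that the labels are assigned purely by row order, so this only makes sense if the first 1000 rows really are the geneA cells, the next 1000 the geneB cells, and the last 1000 the controls.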
Without considering other factors, can I use the above code to generate condition under this assumption?
In one of your adata.obs.condition columns, the number of cells per perturbation and for 'ctrl' seems to follow no pattern at all; in my own adata.obs.condition, which I generated with my code, the numbers of cells per perturbation and for 'ctrl' are almost equal.
My questions:
- Can the above code be used to generate condition?
- Does the distribution of cell numbers across perturbations affect the results? If so, how should that distribution be determined?
Looking forward to your reply.
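For comparing the two distributions mentioned above, a small sketch like the following counts the cells per condition. The `condition` array is a made-up stand-in for adata.obs.condition, and its values are for illustration only; on a real object, `adata.obs["condition"].value_counts()` gives the same tally. This only shows how to inspect the distribution, not how it affects the model.

```python
import numpy as np

# Stand-in for adata.obs["condition"]; the values are made up for illustration.
condition = np.array(["ctrl"] * 5 + ["geneA+ctrl"] * 3 + ["geneB+ctrl"] * 2)

# Count how many cells carry each condition label.
names, counts = np.unique(condition, return_counts=True)
cells_per_condition = dict(zip(names.tolist(), counts.tolist()))

for name, count in cells_per_condition.items():
    print(f"{name}: {count} cells ({count / condition.size:.0%})")
```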