Question about how to generate condition colname?

Question

Hi Yusuf et al.

Assuming my adata has 3000 cells and I want to perturb two genes (geneA and geneB; I know it's not ideal to train models with so few genes), is the following code reasonable?

import numpy as np

# Label 1000 cells per condition, in row order: geneA, geneB, then ctrl
tem = np.repeat(["geneA+ctrl", "geneB+ctrl", "ctrl"], 1000)
# The label vector must have one entry per cell (adata.n_obs == 3000)
adata.obs["condition"] = tem

Of the 3000 cells, 1000 are set as ctrl and the remaining cells are split evenly between geneA and geneB. Is it okay to do this?
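
For reference, here is a minimal sketch of the same labeling with a sanity check on the row count. It uses a plain pandas DataFrame named `obs` as a hypothetical stand-in for `adata.obs`, and the barcodes are made up. It also assumes the rows are already ordered so that the first 1000 cells are the geneA cells, the next 1000 the geneB cells, and the last 1000 the controls; if they are not, the labels will not match the cells.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for adata.obs: 3000 cells with made-up barcodes.
obs = pd.DataFrame(index=[f"cell_{i}" for i in range(3000)])

# Assumed layout: 1000 geneA cells, then 1000 geneB cells, then 1000 controls.
cells_per_condition = {"geneA+ctrl": 1000, "geneB+ctrl": 1000, "ctrl": 1000}

# Repeat each label by its own count, preserving the order above.
labels = np.repeat(list(cells_per_condition.keys()),
                   list(cells_per_condition.values()))

# Fail early if the label vector does not cover every cell exactly once.
if len(labels) != obs.shape[0]:
    raise ValueError(f"{len(labels)} labels for {obs.shape[0]} cells")

obs["condition"] = labels
print(obs["condition"].value_counts())
```

Keeping the counts in one dictionary makes it easy to change the split later without touching the rest of the code.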

Answer 1 · 2024-04-10T08:07:28.000Z

Syntactically this is fine, but it may not produce a strong model.

Answer 2 · 2024-04-10T09:32:41.000Z

Without considering other factors, can I use the above code to generate the condition column under this assumption?

Answer 3 · 2024-04-11T07:08:10.000Z

your data:

my data:

In one of your adata.obs.condition columns, the number of cells per perturbation and for 'ctrl' seems to follow no pattern at all;
in my own adata.obs.condition, which I generated with my code, the numbers of cells per perturbation and for 'ctrl' are almost equal.

My questions:

Can the above code be used to generate the condition column?
Does the distribution of cell numbers across perturbations affect the results? If so, how should I determine that distribution?
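
To make the second question concrete, one way to inspect the distribution is to count cells per condition and compare each perturbation with the controls. This is only a sketch using the standard library; `conditions` is a hypothetical list standing in for `list(adata.obs["condition"])`, and the code only reports the counts, it does not say which distribution is best.

```python
from collections import Counter

# Hypothetical labels standing in for list(adata.obs["condition"]).
conditions = ["geneA+ctrl"] * 1000 + ["geneB+ctrl"] * 1000 + ["ctrl"] * 1000

counts = Counter(conditions)
total = sum(counts.values())

# Report the number and fraction of cells in each condition.
for name, n in counts.most_common():
    print(f"{name}: {n} cells ({n / total:.1%})")

# Ratio of perturbed cells to control cells, per perturbation.
n_ctrl = counts["ctrl"]
ratios = {name: n / n_ctrl for name, n in counts.items() if name != "ctrl"}
print(ratios)
```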

Looking forward to your reply.