Multiple root cells possible?
Hi @flying-sheep ,
Is it possible to set multiple root cells before running pseudotime calculation with palantir?
I have multiple root stem cells in my tissue/data.
Best,
T
I’m not a member of @dpeerlab, so which is the case?
- The original API allows it, while scanpy’s external wrapper doesn’t and needs to be updated (then this should be a scanpy issue instead of one here).
- The palantir library itself doesn’t allow it (then it belongs here, but I shouldn’t be tagged).
Hi all! I was confused about the tag too but am happy to clarify.
Pseudotime is defined from one specific start cell and approximates the time a cell would need to differentiate into any other state.
So, for the typical workflow we recommend using a single stem cell that appears to be the most stem-like. For that, we often take the cell at the extreme of a diffusion component that coincides with stemness. E.g., in this dataset the stem cells are at the bottom and component 0 reaches a maximum there.
To automatically select the extreme cell, you could use
early_cell = palantir.utils.early_cell(ad, celltype="stem_cell", celltype_column="celltype")
Note that this assumes you annotated some cells as "stem_cell" in ad.obs["celltype"]. The subsequent Palantir call could be
palantir.core.run_palantir(ad, early_cell=early_cell)
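To illustrate what "the cell at the extreme of a diffusion component" means, here is a minimal sketch on toy data. It is not Palantir's implementation of early_cell: the table `components`, the cell names, the labels, and the helper `pick_extreme_cell` are all hypothetical, and it assumes the chosen component is at its maximum (rather than its minimum) among the stem cells.

```python
import pandas as pd

# Toy stand-ins (hypothetical): diffusion components per cell and a cell-type label.
components = pd.DataFrame(
    {"DC0": [0.9, 0.7, -0.2, -0.8], "DC1": [0.1, -0.3, 0.5, 0.2]},
    index=["cell_a", "cell_b", "cell_c", "cell_d"],
)
celltype = pd.Series(
    ["stem_cell", "stem_cell", "progenitor", "mature"],
    index=components.index,
)


def pick_extreme_cell(components, celltype, label, component):
    """Return the cell with the largest `component` value among cells labelled `label`."""
    # Restrict to the annotated cells, then take the index of the maximum.
    candidates = components.loc[celltype == label, component]
    return candidates.idxmax()


early_cell = pick_extreme_cell(components, celltype, "stem_cell", "DC0")
print(early_cell)
```

Since the sign of a diffusion component is arbitrary, on real data you may need to check the minimum instead of the maximum.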
However, if you really want to compute a pseudotime for multiple root cells, you can do that too. To compute a separate pseudotime for each root cell, you could run
root_cells = [...]
for cell in root_cells:
    palantir.core.run_palantir(
        ad,
        early_cell=cell,
        pseudo_time_key=f"cell_{cell}_psuedotime",
        entropy_key=f"cell_{cell}_entropy",
        fate_prob_key=f"cell_{cell}_fate_probabilities",
    )
After that, you would have to decide how to define a pseudotime for multiple root cells. Is it the minimal pseudotemporal distance to a root cell? Then you could set it as
ad.obs["multi_root_pseudotime"] = ad.obs[[f"cell_{cell}_psuedotime" for cell in root_cells]].min(axis=1)
Or, if it is the average pseudotemporal distance to a root cell, then you could run
ad.obs["multi_root_pseudotime"] = ad.obs[[f"cell_{cell}_psuedotime" for cell in root_cells]].mean(axis=1)
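To see how the two definitions differ, here is a small sketch on made-up numbers. The DataFrame `obs` stands in for ad.obs after the loop above; the root-cell names and the pseudotime values are invented for illustration only.

```python
import pandas as pd

# Hypothetical root cells and per-root pseudotime columns, named as in the loop above.
root_cells = ["r1", "r2"]
obs = pd.DataFrame(
    {
        "cell_r1_psuedotime": [0.0, 0.2, 0.9, 0.5],
        "cell_r2_psuedotime": [0.8, 0.6, 0.0, 0.5],
    },
    index=["r1", "c1", "r2", "c2"],
)

columns = [f"cell_{cell}_psuedotime" for cell in root_cells]

# Option 1: distance to the nearest root cell.
min_pt = obs[columns].min(axis=1)
# Option 2: average distance over all root cells.
mean_pt = obs[columns].mean(axis=1)

print(pd.DataFrame({"min": min_pt, "mean": mean_pt}))
```

With the minimum, every root cell gets a pseudotime of 0. With the mean, a root cell generally does not: in this toy example r1 ends up at 0.4, the same value as the non-root cell c1.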
Please let me know if I got this right or if you have any further questions!