bunnech/cellot

Cross-species: unable to reproduce results for CellOT

Hello, I'm trying to reproduce the cross-species results for CellOT (using rats as the starting point), but I ran into an issue.

The steps I tried are as follows:

  • Get the embedding by training model-scgen with the following arguments: --outdir ./results/scrna-crossspecies/mode-iid/model-scgen --config ./configs/tasks/crossspecies.yaml --config ./configs/models/scgen.yaml
  • Train CellOT on top of the scgen embedding with the following arguments: --outdir ./results/scrna-crossspecies/mode-iid/model-cellot --config ./configs/tasks/crossspecies.yaml --config ./configs/models/cellot.yaml --config.data.ae_emb.path ./results/scrna-crossspecies/mode-iid/model-scgen

Once the results were stored, I ran the evaluation with the following arguments: --outdir results/scrna-crossspecies/mode-iid/model-cellot --n_markers 50 --setting iid --where data_space

The results I get are:

'mmd': 0.4460518822912073, 'l2': 15.850724, 'r2': 0.4929862534733026

What have I done wrong? For reference, I got the following for identity, which seems to make more sense:
'mmd': 0.20872688110292562, 'l2': 11.169688, 'r2': 0.7255046739934895
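
To make the gap between the two runs explicit, the result dictionaries above can be compared metric by metric. The following is only a minimal sketch: the `compare_to_baseline` helper and the `HIGHER_IS_BETTER` table are hypothetical names, not part of the cellot repository, and the direction of each metric (lower is better for mmd and l2, higher is better for r2) is an assumption.

```python
# Metric values copied from the two evaluation runs reported above.
cellot = {"mmd": 0.4460518822912073, "l2": 15.850724, "r2": 0.4929862534733026}
identity = {"mmd": 0.20872688110292562, "l2": 11.169688, "r2": 0.7255046739934895}

# Assumption: mmd and l2 are distances (lower is better),
# while r2 is a goodness-of-fit score (higher is better).
HIGHER_IS_BETTER = {"mmd": False, "l2": False, "r2": True}


def compare_to_baseline(model, baseline):
    """Return, for each metric, True if the model beats the baseline."""
    verdict = {}
    for name, higher in HIGHER_IS_BETTER.items():
        if higher:
            verdict[name] = model[name] > baseline[name]
        else:
            verdict[name] = model[name] < baseline[name]
    return verdict


result = compare_to_baseline(cellot, identity)
for name, better in result.items():
    print(
        f"{name}: cellot={cellot[name]:.4f} "
        f"identity={identity[name]:.4f} beats_identity={better}"
    )
```

Under these assumptions, the CellOT run is worse than identity on all three metrics, which suggests the problem lies upstream of the transport map itself.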

Thank you!

We only tested the setting described in the paper itself, copying what the authors of scGen do. That is, we only tested rat and mouse as the holdout cell type. Are you doing something else?

Ah, but you are doing iid anyway. It is worth mentioning that the results in the paper are in ood mode. Nevertheless, you should get different results in iid. Why would you train iid models here, though?

Something went wrong in that case, but it is hard to narrow down what. An r2 of ~50% suggests that your initial embedding (and then ultimately the decoder) is not trained properly. If you want, I can share trained models with you (provided I can still access them and actually have them for iid).
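
One way to test that hypothesis is to check the autoencoder on its own: encode and decode held-out cells, then compare the per-gene means of the reconstruction with those of the input. The sketch below is illustrative only. It assumes r2 is the squared Pearson correlation between per-gene means, which may differ from what `evaluate.py` computes, and it uses synthetic data in place of real decoder output. The names `feature_means` and `r2_of_means` are hypothetical.

```python
import random


def feature_means(rows):
    """Per-gene mean over cells; rows is a list of equal-length lists."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def r2_of_means(original, reconstructed):
    """Squared Pearson correlation between per-gene means (assumed metric)."""
    x = feature_means(original)
    y = feature_means(reconstructed)
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov * cov / (vx * vy)


random.seed(0)
n_cells, n_genes = 200, 50

# Synthetic "expression" matrix: gene j has mean 0.1 * j plus cell-level noise.
original = [
    [0.1 * j + random.gauss(0.0, 0.5) for j in range(n_genes)]
    for _ in range(n_cells)
]

# Stand-in for a well-trained autoencoder: the input plus small noise.
good_recon = [[v + random.gauss(0.0, 0.05) for v in row] for row in original]

# Stand-in for a poorly trained autoencoder: output unrelated to the input.
bad_recon = [
    [random.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_cells)
]

r2_good = r2_of_means(original, good_recon)
r2_bad = r2_of_means(original, bad_recon)
print(f"r2 (good reconstruction): {r2_good:.3f}")
print(f"r2 (bad reconstruction):  {r2_bad:.3f}")
```

If the reconstruction r2 on real data is already low at this stage, the embedding is the likely culprit, and retraining or swapping in a known-good autoencoder would be the next step before looking at CellOT itself.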

Hi Charlotte, thank you for the reply! Yes, I only tried the rat one (it is the only one given in the config).

And yes, it would be great if you could share the trained scgen model with me :) I tried evaluating the autoencoder models that I trained with the configs you provided (scgen / cae), but I get a dimension error. Is this the correct command to evaluate them?

python ./scripts/evaluate.py --outdir results/scrna-crossspecies/mode-iid/model-scgen --n_markers 50 --setting iid --where data_space