Single-Cell-Genomics-Group-CNAG-CRG/Tumor-Immune-Cell-Atlas

What changed between the preprint (2020) and final version (2021) of the Atlas?

nigiord opened this issue · 0 comments

Hi and thank you for providing TICA.

I’ve been using it to annotate scRNA-seq datasets from purified Multiple Myeloma samples, in order to identify and remove cells that were not properly "purified out" before sequencing. I was originally using the downsampled 2020 (preprint) version of TICA with SingleR and got results that made sense (90-99% of the cells were B cells, Plasma B cells or Proliferative B cells, depending on the sample). However, when I updated to the final (2021) version of the Atlas, the annotation stopped making any sense (see the example below).

2020 version:

| class | counts | proportion |
|---|---:|---:|
| B cells | 8 | 0.0018497 |
| Plasma B cells | 4312 | 0.9969942 |
| Proliferative B cells | 2 | 0.0004624 |
| Th17 cells | 3 | 0.0006936 |
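
For reference, the counts and proportion columns in these tables are plain label frequencies over the predicted labels. A minimal Python sketch of that arithmetic follows; the original analysis was run with SingleR, and both the `label_table` helper and the `labels` vector (rebuilt here from the 2020-version table above) are hypothetical names used only for illustration.

```python
from collections import Counter

def label_table(labels):
    """Return (class, count, proportion) rows sorted by class name."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(cls, n, n / total) for cls, n in sorted(counts.items())]

# Hypothetical label vector rebuilt from the 2020-version table above.
labels = (
    ["B cells"] * 8
    + ["Plasma B cells"] * 4312
    + ["Proliferative B cells"] * 2
    + ["Th17 cells"] * 3
)

for cls, n, prop in label_table(labels):
    print(f"{cls}\t{n}\t{prop:.7f}")
```

With 4325 cells in total, this reproduces the proportions listed above (for example 4312 / 4325 for Plasma B cells).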

Same sample annotated with the 2021 version:

| class | counts | proportion |
|---|---:|---:|
| B cells | 49 | 0.0113295 |
| B cells proliferative | 4 | 0.0009249 |
| CD4 effector memory | 11 | 0.0025434 |
| CD4 naive-memory | 16 | 0.0036994 |
| CD4 recently activated | 633 | 0.1463584 |
| CD4 transitional memory | 16 | 0.0036994 |
| CD8 cytotoxic | 342 | 0.0790751 |
| CD8 effector memory | 226 | 0.0522543 |
| CD8 pre-exhausted | 30 | 0.0069364 |
| CD8 terminally exhausted | 77 | 0.0178035 |
| Macro. and mono. prolif. | 2 | 0.0004624 |
| Macrophages SPP1 | 381 | 0.0880925 |
| Mast cells | 2294 | 0.5304046 |
| mDC | 11 | 0.0025434 |
| Monocytes | 36 | 0.0083237 |
| NK | 1 | 0.0002312 |
| Plasma B cells | 42 | 0.0097110 |
| T cells naive | 37 | 0.0085549 |
| T cells proliferative | 6 | 0.0013873 |
| T cells regulatory | 46 | 0.0106358 |
| T helper cells | 17 | 0.0039306 |
| TAMs C1QC | 46 | 0.0106358 |
| TAMs proinflamatory | 2 | 0.0004624 |

I’m not sure what I’m doing wrong. Interestingly, when I try to annotate the 2021 object using the 2020 object as reference, I get the following result:

| class | counts | proportion |
|---|---:|---:|
| B cells | 128 | 0.0512 |
| cDC | 16 | 0.0064 |
| Cytotoxic CD8 T cells | 73 | 0.0292 |
| Effector memory CD8 T cells | 10 | 0.0040 |
| M2 TAMs | 143 | 0.0572 |
| mDC | 192 | 0.0768 |
| Monocytes | 58 | 0.0232 |
| Naive T cells | 739 | 0.2956 |
| pDC | 20 | 0.0080 |
| Plasma B cells | 62 | 0.0248 |
| Proinflamatory TAMs | 3 | 0.0012 |
| Proliferative B cells | 25 | 0.0100 |
| Proliferative monocytes and macrophages | 53 | 0.0212 |
| Proliferative T cells | 110 | 0.0440 |
| Regulatory T cells | 126 | 0.0504 |
| T helper cells | 713 | 0.2852 |
| Terminally exhausted CD8 T cells | 3 | 0.0012 |
| Th17 cells | 25 | 0.0100 |
| Transitional memory CD4 T cells | 1 | 0.0004 |

...even though the downsampled 2021 object initially contains 100 cells of each subtype. I get similarly inconsistent results when annotating the 2020 object using the 2021 object as reference:

| class | counts | proportion |
|---|---:|---:|
| B cells | 72 | 0.0028993 |
| B cells proliferative | 758 | 0.0305227 |
| CD4 effector memory | 262 | 0.0105501 |
| CD4 naive-memory | 680 | 0.0273818 |
| CD4 recently activated | 2842 | 0.1144399 |
| CD4 transitional memory | 532 | 0.0214222 |
| CD8 cytotoxic | 178 | 0.0071676 |
| CD8 effector memory | 35 | 0.0014094 |
| CD8 pre-exhausted | 18 | 0.0007248 |
| CD8 terminally exhausted | 76 | 0.0030603 |
| cDC | 43 | 0.0017315 |
| Macro. and mono. prolif. | 451 | 0.0181606 |
| Macrophages SPP1 | 11907 | 0.4794636 |
| Mast cells | 5391 | 0.2170814 |
| mDC | 72 | 0.0028993 |
| Monocytes | 317 | 0.0127648 |
| NK | 52 | 0.0020939 |
| pDC | 21 | 0.0008456 |
| Plasma B cells | 63 | 0.0025368 |
| T cells naive | 147 | 0.0059193 |
| T cells proliferative | 41 | 0.0016510 |
| T cells regulatory | 341 | 0.0137312 |
| T helper cells | 313 | 0.0126037 |
| TAMs C1QC | 133 | 0.0053556 |
| TAMs proinflamatory | 89 | 0.0035838 |

(expected: 1000 cells of each subtype)
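
One way to put a number on how far this cross-annotation is from the balanced design is the fraction of cells sitting in over-represented labels, i.e. the cells in excess of the expected per-class count. The Python sketch below is only an illustration: `excess_fraction` is a hypothetical helper, the counts are copied from the table above, and the interpretation assumes that the 2020 and 2021 label sets correspond roughly one-to-one, which may not hold for every class.

```python
def excess_fraction(counts, expected_per_class):
    """Fraction of all cells that exceed the expected count in their class."""
    total = sum(counts.values())
    excess = sum(max(0, n - expected_per_class) for n in counts.values())
    return excess / total

# Predicted labels for the 2020 object annotated with the 2021 reference
# (counts copied from the table above).
counts_2020_vs_2021 = {
    "B cells": 72, "B cells proliferative": 758, "CD4 effector memory": 262,
    "CD4 naive-memory": 680, "CD4 recently activated": 2842,
    "CD4 transitional memory": 532, "CD8 cytotoxic": 178,
    "CD8 effector memory": 35, "CD8 pre-exhausted": 18,
    "CD8 terminally exhausted": 76, "cDC": 43,
    "Macro. and mono. prolif.": 451, "Macrophages SPP1": 11907,
    "Mast cells": 5391, "mDC": 72, "Monocytes": 317, "NK": 52, "pDC": 21,
    "Plasma B cells": 63, "T cells naive": 147, "T cells proliferative": 41,
    "T cells regulatory": 341, "T helper cells": 313, "TAMs C1QC": 133,
    "TAMs proinflamatory": 89,
}

# Assumption: 1000 cells per subtype were expected, as stated above.
print(f"{excess_fraction(counts_2020_vs_2021, 1000):.3f}")
```

Under those assumptions, roughly 69% of the cells (17140 of 24834) fall in excess of the expected 1000 per class, almost all of them in Macrophages SPP1, Mast cells and CD4 recently activated.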

I also tried annotating my samples with the full 2021 Atlas (which takes forever...), but the problem persists, so it does not seem to be caused by the new downsampled version containing 10x fewer cells. Furthermore, the huge discrepancies when annotating one Atlas with the other are rather suspicious.

Any idea of what’s happening?

Cheers,
Nils