What changed between the preprint (2020) and final version (2021) of the Atlas?

Question

nigiord opened this issue a year ago · 0 comments

Hi and thank you for providing TICA.

I’ve been using it to annotate scRNA-seq datasets from purified Multiple Myeloma samples, in order to identify and remove cells that were not properly purified out before sequencing. I was originally using the downsampled 2020 (preprint) version of the TICA with SingleR and got results that made sense (90-99% of the cells were B cells, Plasma B cells or Proliferative B cells, depending on the sample). However, when I updated to the final (2021) version of the TICAtlas, the annotation stopped making any sense (see example below).

2020 version:

class	counts	proportion
B cells	8	0.0018497
Plasma B cells	4312	0.9969942
Proliferative B cells	2	0.0004624
Th17 cells	3	0.0006936
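
For reference, the `proportion` column is each class count divided by the total number of cells in the sample. The annotation itself was done in R with SingleR. The Python sketch below only shows how such a summary can be rebuilt from a vector of per-cell labels. The function name `summarise_labels` is hypothetical, and the label vector is reconstructed from the counts in the table above.

```python
from collections import Counter


def summarise_labels(labels):
    """Return (class, count, proportion) rows, sorted by class name."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(cls, n, round(n / total, 7)) for cls, n in sorted(counts.items())]


# Per-cell label vector rebuilt from the counts in the 2020 table above.
labels = (
    ["B cells"] * 8
    + ["Plasma B cells"] * 4312
    + ["Proliferative B cells"] * 2
    + ["Th17 cells"] * 3
)

rows = summarise_labels(labels)
for cls, n, prop in rows:
    print(f"{cls}\t{n}\t{prop}")
```

With 4325 cells in total, this gives the same proportions as the table.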

Same sample annotated with the 2021 version:

class	counts	proportion
B cells	49	0.0113295
B cells proliferative	4	0.0009249
CD4 effector memory	11	0.0025434
CD4 naive-memory	16	0.0036994
CD4 recently activated	633	0.1463584
CD4 transitional memory	16	0.0036994
CD8 cytotoxic	342	0.0790751
CD8 effector memory	226	0.0522543
CD8 pre-exhausted	30	0.0069364
CD8 terminally exhausted	77	0.0178035
Macro. and mono. prolif.	2	0.0004624
Macrophages SPP1	381	0.0880925
Mast cells	2294	0.5304046
mDC	11	0.0025434
Monocytes	36	0.0083237
NK	1	0.0002312
Plasma B cells	42	0.0097110
T cells naive	37	0.0085549
T cells proliferative	6	0.0013873
T cells regulatory	46	0.0106358
T helper cells	17	0.0039306
TAMs C1QC	46	0.0106358
TAMs proinflamatory	2	0.0004624

I’m not sure what I’m doing wrong. Interestingly, when I try to annotate the 2021 object using the 2020 object as reference, I get the following result:

class	counts	proportion
B cells	128	0.0512
cDC	16	0.0064
Cytotoxic CD8 T cells	73	0.0292
Effector memory CD8 T cells	10	0.0040
M2 TAMs	143	0.0572
mDC	192	0.0768
Monocytes	58	0.0232
Naive T cells	739	0.2956
pDC	20	0.0080
Plasma B cells	62	0.0248
Proinflamatory TAMs	3	0.0012
Proliferative B cells	25	0.0100
Proliferative monocytes and macrophages	53	0.0212
Proliferative T cells	110	0.0440
Regulatory T cells	126	0.0504
T helper cells	713	0.2852
Terminally exhausted CD8 T cells	3	0.0012
Th17 cells	25	0.0100
Transitional memory CD4 T cells	1	0.0004

...even though the 2021 object initially contains 100 cells of each subtype. I see the same kind of result when annotating the 2020 object using the 2021 object as reference:

class	counts	proportion
B cells	72	0.0028993
B cells proliferative	758	0.0305227
CD4 effector memory	262	0.0105501
CD4 naive-memory	680	0.0273818
CD4 recently activated	2842	0.1144399
CD4 transitional memory	532	0.0214222
CD8 cytotoxic	178	0.0071676
CD8 effector memory	35	0.0014094
CD8 pre-exhausted	18	0.0007248
CD8 terminally exhausted	76	0.0030603
cDC	43	0.0017315
Macro. and mono. prolif.	451	0.0181606
Macrophages SPP1	11907	0.4794636
Mast cells	5391	0.2170814
mDC	72	0.0028993
Monocytes	317	0.0127648
NK	52	0.0020939
pDC	21	0.0008456
Plasma B cells	63	0.0025368
T cells naive	147	0.0059193
T cells proliferative	41	0.0016510
T cells regulatory	341	0.0137312
T helper cells	313	0.0126037
TAMs C1QC	133	0.0053556
TAMs proinflamatory	89	0.0035838

(expected: 1000 cells of each subtype)
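
To narrow down which subtypes are being reassigned, it may help to cross-tabulate each cell's original atlas label against the label assigned by the other atlas, instead of looking only at the marginal counts. Below is a minimal sketch in Python (not the R/SingleR code used for the tables above). The function names `cross_tabulate` and `top_assignment` are hypothetical, and the toy labels are made up purely for illustration; they are not taken from either atlas.

```python
from collections import Counter, defaultdict


def cross_tabulate(true_labels, predicted_labels):
    """Count, for every original label, how often each predicted label occurs."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label vectors must have the same length")
    table = defaultdict(Counter)
    for true, pred in zip(true_labels, predicted_labels):
        table[true][pred] += 1
    return table


def top_assignment(table):
    """For each original label, return (most common predicted label, its fraction)."""
    summary = {}
    for true, preds in table.items():
        label, n = preds.most_common(1)[0]
        summary[true] = (label, n / sum(preds.values()))
    return summary


# Toy, made-up example: 4 cells per original class.
true_labels = ["Plasma B cells"] * 4 + ["Mast cells"] * 4
predicted_labels = ["Plasma B cells"] * 3 + ["Mast cells"] * 5

table = cross_tabulate(true_labels, predicted_labels)
summary = top_assignment(table)
for true, (pred, frac) in summary.items():
    print(f"{true}\t->\t{pred}\t{frac:.2f}")
```

Because the 2020 and 2021 atlases name their classes differently (for example "Naive T cells" versus "T cells naive"), the rows and columns of such a table would carry different label sets. The point is only to see whether each original class maps mostly to one sensible counterpart.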

I also tried to annotate my samples with the full 2021 Atlas (takes forever...) but the problem persists, so it does not seem to be linked to the fact that the new downsampled version contains 10x fewer cells. Furthermore, the huge discrepancies when annotating one Atlas with the other are rather suspicious.

Any idea of what’s happening?

Cheers,
Nils