What changed between the preprint (2020) and final version (2021) of the Atlas?
nigiord opened this issue · 0 comments
Hi and thank you for providing TICA.
I’ve been using it to annotate scRNA-seq datasets from purified multiple myeloma samples, in order to identify and remove cells that were not properly purified out before sequencing. I was originally using the downsampled 2020 (preprint) version of the TICA with SingleR and got results that made sense (90-99% of the cells were B cells, Plasma B cells or Proliferative B cells, depending on the sample). However, when I updated to the final version of the TICAtlas, the annotations stopped making any sense (see the example below).
2020 version:
class | counts | proportion |
---|---|---|
B cells | 8 | 0.0018497 |
Plasma B cells | 4312 | 0.9969942 |
Proliferative B cells | 2 | 0.0004624 |
Th17 cells | 3 | 0.0006936 |
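For reference, the proportion column in these tables is just each class count divided by the total number of cells in the sample (4325 for the table above). A minimal sketch of that tabulation in Python, assuming one predicted label per cell; the function name `label_proportions` is hypothetical and the counts are copied from the 2020 table:

```python
from collections import Counter


def label_proportions(labels):
    """Return {label: (count, proportion)} for an iterable of predicted labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in sorted(counts.items())}


# Counts copied from the 2020 table, expanded back into one label per cell.
labels_2020 = (
    ["B cells"] * 8
    + ["Plasma B cells"] * 4312
    + ["Proliferative B cells"] * 2
    + ["Th17 cells"] * 3
)

summary = label_proportions(labels_2020)
for label, (n, prop) in summary.items():
    print(f"{label} | {n} | {prop:.7f} |")
```

The same function can be applied to any vector of predicted labels, which makes it easy to compare the 2020 and 2021 annotations of one sample side by side.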
Same sample annotated with the 2021 version:
class | counts | proportion |
---|---|---|
B cells | 49 | 0.0113295 |
B cells proliferative | 4 | 0.0009249 |
CD4 effector memory | 11 | 0.0025434 |
CD4 naive-memory | 16 | 0.0036994 |
CD4 recently activated | 633 | 0.1463584 |
CD4 transitional memory | 16 | 0.0036994 |
CD8 cytotoxic | 342 | 0.0790751 |
CD8 effector memory | 226 | 0.0522543 |
CD8 pre-exhausted | 30 | 0.0069364 |
CD8 terminally exhausted | 77 | 0.0178035 |
Macro. and mono. prolif. | 2 | 0.0004624 |
Macrophages SPP1 | 381 | 0.0880925 |
Mast cells | 2294 | 0.5304046 |
mDC | 11 | 0.0025434 |
Monocytes | 36 | 0.0083237 |
NK | 1 | 0.0002312 |
Plasma B cells | 42 | 0.0097110 |
T cells naive | 37 | 0.0085549 |
T cells proliferative | 6 | 0.0013873 |
T cells regulatory | 46 | 0.0106358 |
T helper cells | 17 | 0.0039306 |
TAMs C1QC | 46 | 0.0106358 |
TAMs proinflamatory | 2 | 0.0004624 |
I’m not sure what I’m doing wrong. Interestingly, when I try to annotate the 2021 object using the 2020 object as reference, I get the following result:
class | counts | proportion |
---|---|---|
B cells | 128 | 0.0512 |
cDC | 16 | 0.0064 |
Cytotoxic CD8 T cells | 73 | 0.0292 |
Effector memory CD8 T cells | 10 | 0.0040 |
M2 TAMs | 143 | 0.0572 |
mDC | 192 | 0.0768 |
Monocytes | 58 | 0.0232 |
Naive T cells | 739 | 0.2956 |
pDC | 20 | 0.0080 |
Plasma B cells | 62 | 0.0248 |
Proinflamatory TAMs | 3 | 0.0012 |
Proliferative B cells | 25 | 0.0100 |
Proliferative monocytes and macrophages | 53 | 0.0212 |
Proliferative T cells | 110 | 0.0440 |
Regulatory T cells | 126 | 0.0504 |
T helper cells | 713 | 0.2852 |
Terminally exhausted CD8 T cells | 3 | 0.0012 |
Th17 cells | 25 | 0.0100 |
Transitional memory CD4 T cells | 1 | 0.0004 |
...even though the 2021 object initially contains 100 cells of each subtype. I get a similar result when annotating the 2020 object using the 2021 object as reference:
class | counts | proportion |
---|---|---|
B cells | 72 | 0.0028993 |
B cells proliferative | 758 | 0.0305227 |
CD4 effector memory | 262 | 0.0105501 |
CD4 naive-memory | 680 | 0.0273818 |
CD4 recently activated | 2842 | 0.1144399 |
CD4 transitional memory | 532 | 0.0214222 |
CD8 cytotoxic | 178 | 0.0071676 |
CD8 effector memory | 35 | 0.0014094 |
CD8 pre-exhausted | 18 | 0.0007248 |
CD8 terminally exhausted | 76 | 0.0030603 |
cDC | 43 | 0.0017315 |
Macro. and mono. prolif. | 451 | 0.0181606 |
Macrophages SPP1 | 11907 | 0.4794636 |
Mast cells | 5391 | 0.2170814 |
mDC | 72 | 0.0028993 |
Monocytes | 317 | 0.0127648 |
NK | 52 | 0.0020939 |
pDC | 21 | 0.0008456 |
Plasma B cells | 63 | 0.0025368 |
T cells naive | 147 | 0.0059193 |
T cells proliferative | 41 | 0.0016510 |
T cells regulatory | 341 | 0.0137312 |
T helper cells | 313 | 0.0126037 |
TAMs C1QC | 133 | 0.0053556 |
TAMs proinflamatory | 89 | 0.0035838 |
(expected: 1000 cells of each subtype)
I also tried to annotate my samples with the full 2021 Atlas (which takes forever...), but the problem stays the same, so it does not seem to be linked to the fact that the new downsampled version contains 10x fewer cells. Furthermore, the huge discrepancies when annotating one Atlas with the other are rather suspicious.
Any idea of what’s happening?
Cheers,
Nils