Single-Cell-Genomics-Group-CNAG-CRG/Tumor-Immune-Cell-Atlas

counts are not integers.

bjstewart1 opened this issue · 3 comments

the counts in adata.raw.X are not all integers

library(reticulate)
sc <- import("scanpy")
adata <- sc$read_h5ad("data/TICAtlas.h5ad")
adata$X <- adata$raw$X
rs <- Matrix::rowSums(adata$X)
all(rs == round(rs)) # returns FALSE, so at least some entries are not integers
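A row-sum check can miss non-integer entries that happen to add up to a whole number within a cell, so an entry-wise check may be more informative. The following is a minimal sketch in Python (the file is read through scanpy either way), assuming numpy is available and that the matrix is either a dense array or a SciPy-style sparse matrix exposing its stored values as `.data`. The helper names `stored_values` and `fraction_non_integer` are hypothetical and not part of scanpy or this repository.

```python
import numpy as np


def stored_values(X):
    """Return the values of a dense array, or the stored entries of a sparse matrix."""
    # Assumption: sparse inputs follow the SciPy CSR/CSC/COO layout, where
    # explicit entries live in `.data`. Implicit zeros are integers, so
    # skipping them cannot hide a non-integer value.
    if isinstance(X, np.ndarray):
        return X.ravel()
    return np.asarray(X.data).ravel()


def fraction_non_integer(X):
    """Fraction of inspected entries that are not whole numbers."""
    vals = stored_values(X)
    if vals.size == 0:
        return 0.0
    return float(np.mean(vals != np.round(vals)))
```

With the object from the snippet above loaded in Python, this could be called as `fraction_non_integer(adata.raw.X)`. For a sparse matrix the fraction is relative to the stored entries only, not to all cells times genes.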

Hello,
I was also wondering about the processing of the data available on Zenodo, since the counts slot in the Seurat objects does not contain only integers. Would it be possible to have more information on the processing steps performed on the raw counts?
Thank you for your help and for the great resources made available!
Aurélie

Hello! This is because for one dataset (breast) the "raw" data was provided as TPMs rather than raw counts, so those values are not true raw counts, but this is the best we could get.
@aurelieGabriel regarding the processing of the raw counts, all we did was filter out non-immune cells (although we did more filtering after integration); the rest is described in more detail in the integration folder of this repository.
Hope this was helpful!
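Since TPM values are scaled to one million per cell by definition, per-cell totals give a rough way to spot TPM-like data. The sketch below is only a heuristic under stated assumptions: numpy is available, the matrix is cells by genes, and no genes were removed after normalization (gene filtering would pull the totals below one million). The function name `looks_like_tpm` and its `tol` parameter are hypothetical.

```python
import numpy as np


def looks_like_tpm(X, target=1e6, tol=0.05):
    """Heuristic: True if the median per-cell total is within tol of `target`."""
    # np.asarray(...).ravel() handles both dense arrays and the 2-D matrix
    # that SciPy sparse matrices return from .sum(axis=1).
    totals = np.asarray(X.sum(axis=1)).ravel()
    if totals.size == 0:
        return False
    return bool(abs(np.median(totals) - target) <= tol * target)
```

A `False` result does not prove that the values are raw counts; it only says the per-cell totals do not cluster near one million.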

Hello,
I apologize for the delay. Thank you for your answer, it does help. However, I noticed that non-integer values are also found in the following "source" datasets: liver2, lung1 and melanoma1. Were those samples treated differently in the integration step and the generation of the Atlas?
Thank you for your help.
Best wishes,
Aurélie
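A per-dataset breakdown like the one described above (liver2, lung1, melanoma1) could be reproduced with a sketch along these lines. It assumes numpy, a cells-by-genes matrix that supports row indexing (a dense array or a SciPy CSR matrix), and one dataset label per cell. The function name `non_integer_by_group` is hypothetical, and the metadata column holding the source label (written `source` in the usage note) is an assumption, not a confirmed column name in the Atlas objects.

```python
import numpy as np


def non_integer_by_group(X, labels):
    """Map each group label to True if any of its cells has a non-integer value."""
    labels = np.asarray(labels)
    report = {}
    for group in np.unique(labels):
        # Select the rows (cells) that belong to this group.
        rows = np.flatnonzero(labels == group)
        sub = X[rows]
        # Dense arrays are inspected directly; sparse matrices are assumed
        # to expose their stored entries as `.data` (SciPy convention).
        vals = sub if isinstance(sub, np.ndarray) else sub.data
        vals = np.asarray(vals).ravel()
        report[str(group)] = bool(np.any(vals != np.round(vals)))
    return report
```

With an AnnData object loaded in Python, a call might look like `non_integer_by_group(adata.raw.X, adata.obs["source"])`, where `"source"` stands in for whichever column records the dataset of origin.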