AbbVie-ComputationalGenomics/SCArray

Batch Correction/Integration Compatibility


Hello! Thank you for this software package. I was wondering whether SCArray can be used for batch correction. I have a dataset of 921 samples and over 4 million cells, with both raw counts and normalized, imputed counts. These are saved as two BPCells matrices, each of which has been split by sample of origin (1842 count matrices in total: two per sample, one raw and one normalized/imputed). I would like to integrate all of these samples using the Seurat RPCA protocol, but I can barely create the initial RNA assay with the samples separated into individual layers, even though each layer is backed by that sample's on-disk BPCells matrix rather than a sparse matrix loaded into memory. My computer has 64 GB of memory and 4 cores, and runs Windows 11 with Seurat version 5.1.0. My questions are as follows:

  1. Can I load the individual BPCells matrices (both the raw counts and the normalized counts for each sample) into an SCArray-formatted assay?
  2. Can this assay be split into layers like the assays in Seurat v5, or do all cells from all samples need to be joined in a single count matrix?
  3. Can I run the Seurat v5 pipeline from FindVariableFeatures to ScaleData to RunPCA to IntegrateLayers on the SCArray assay?

Thank you so much!
Best,
Skanda