Batch Correction/Integration Compatibility
Hello! Thank you for this software package. I was wondering whether SCArray can be used for batch correction. I have a dataset of 921 samples and over 4 million cells, with both raw counts and normalized, imputed counts. The data are stored as BPCells matrices split by sample of origin, giving 1842 count matrices in total: two per sample, one raw and one normalized/imputed.

I would like to integrate all of these samples using the Seurat RPCA protocol, but I can barely create the initial RNA assay with the samples separated into individual layers, even though each layer is an on-disk BPCells matrix rather than a sparse matrix loaded into memory. My computer has 64 GB of memory and 4 cores, and runs Windows 11 with Seurat version 5.1.0. My questions are as follows:
- Can I load the individual BPCells matrices (both the raw counts and the normalized counts for each sample) into an SCArray-formatted assay?
- Can this assay be split into layers like the assays in Seurat v5, or do all cells from all samples need to be joined in a single count matrix?
- Can I run the Seurat v5 pipeline (FindVariableFeatures, then ScaleData, RunPCA, and IntegrateLayers) on the SCArray assay?
Thank you so much!
Best,
Skanda