Make SPOTlight results reproducible + use all input snRNA-seq data

See https://jhu-genomics.slack.com/archives/C01EA7VDJNT/p1671588769748179 for the full description.

Basically, over the winter break it'd be nice to have SPOTlight re-run using:

- all the input snRNA-seq data if possible (e.g. try on caracol if it needs lots of memory)
- if we can't use all the input snRNA-seq data, up to 1000 nuclei per cell type instead of the up to 100 that you are using now, plus a set.seed() call to make the results reproducible.

That is, add a set.seed() call before

spatialDLPFC/code/spot_deconvo/04-spotlight/02-nonIF.R

Lines 145 to 146 in 1b4d8e7

    
    #   This was slightly changed from the tutorial for simplicity
    cs_keep <- lapply(idx, function(i) sample(i, min(length(i), n_cells_per_type)))

if needed. Or well, avoid

spatialDLPFC/code/spot_deconvo/04-spotlight/02-nonIF.R

Line 60 in 1b4d8e7

n_cells_per_type <- 100

altogether.

(Applies for both IF and non-IF data)
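For illustration, here is a minimal sketch of what the seeded subsampling could look like. It is not the code in 02-nonIF.R: the helper name `subsample_nuclei`, the seed value, and the toy `cell_types` vector are all made up for this example. It also indexes with `sample.int()` rather than calling `sample(i, ...)` directly, because `sample()` treats a length-one numeric `i` as `1:i`, which could matter for a cell type with a single nucleus.

```r
# Hypothetical helper: subsample up to n_cells_per_type nuclei per cell type,
# with a fixed seed so repeated runs select the same nuclei.
subsample_nuclei <- function(cell_types, n_cells_per_type = 1000, seed = 20221220) {
    # Indices of the nuclei belonging to each cell type
    idx <- split(seq_along(cell_types), cell_types)

    # Seed value is arbitrary (an assumption); it only needs to be fixed
    set.seed(seed)

    # sample.int() on the length avoids the sample(i) == sample(1:i) gotcha
    lapply(idx, function(i) {
        i[sample.int(length(i), min(length(i), n_cells_per_type))]
    })
}

# Toy stand-in for the cell type labels of the snRNA-seq data
cell_types <- rep(c("Excit", "Inhib", "Oligo"), times = c(2500, 1200, 40))

keep <- subsample_nuclei(cell_types)
lengths(keep)
```

Passing `n_cells_per_type = Inf` keeps every nucleus (in shuffled order), which corresponds to the "use all the input data" option without needing a separate code path.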

Thanks!

After reading the docs for SPOTlight, @Nick-Eagles, @lahuuki, and I agree that we should use all the data, since the docs say you need more cells when your cell types are related, which is the case in our layer-level analysis.

https://bioconductor.org/packages/release/bioc/vignettes/SPOTlight/inst/doc/SPOTlight_kidney.html
