Advice without HTODemux
Hi @MattPM, this looks like a nice package ;)
I was wondering if you could give me some advice on using your method.
To get singlets and negatives I have to run HTODemux, right? My problem is that HTODemux assumes each sample is "single positive" for one HTO, so assigning a sample to an HTO is straightforward. In my case, I have a multi-positive situation (e.g., 6 total HTOs, with each sample double-positive or triple-positive; see Figure 4 in https://www.nature.com/articles/nprot.2015.020 for an example in CyTOF).
What I actually want to do is normalize HTO and Ab counts, because I think both suffer from the same background noise. I saw in your vignette there was another way of doing this using low expression cells. Do you think this would make sense to normalize the HTOs, so I can demultiplex (on my own) with a multi-positive setup?
Hi, thanks for your interest.
"What I actually want to do is normalize HTO and Ab counts, because I think both suffer from the same background noise. I saw in your vignette there was another way of doing this using low expression cells. Do you think this would make sense to normalize the HTOs, so I can demultiplex (on my own) with a multi-positive setup?"
This is a great idea and I'd be curious to hear how it goes. The fact that each cell is positive for multiple HTOs would not affect the normalization. Denoising a sample barcoding matrix with the DSB normalization, using an estimate of the background, centers the background at 0, which should improve demultiplexing results with things like cell hashing antibodies or LMO tags a la MULTI-seq. You could define the ambient droplets another way (e.g. with the mRNA from CITE-seq data), further QC that matrix to make sure there are no obvious protein counts in the ambient drops, then use that empty_drop_matrix to denoise both the ADT and HTO data, then demultiplex. We have had internal discussions about this but have not tested it extensively, since in our data hashing provides a nice orthogonal measure of negative droplets, which we then further QC. It is worth further testing.
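To make the "centering the background at 0" idea concrete, here is a minimal sketch in plain Python (not the dsb R code itself). It only illustrates rescaling each HTO by the mean and standard deviation of its log counts in empty droplets, followed by a simple multi-positive call. The function names, the pseudocount of 10, the threshold of 3.5, and the toy counts are all assumptions made for illustration, and the per-cell denoising step of the method is not shown.

```python
import math
from statistics import mean, stdev


def background_normalize(cell_counts, empty_counts, pseudocount=10.0):
    """Rescale each tag by the mean and sd of its log counts in empty droplets.

    cell_counts, empty_counts: dict mapping tag name -> list of raw counts
    (one entry per droplet). Returns dict tag name -> list of rescaled values.
    The pseudocount is an illustrative assumption.
    """
    normalized = {}
    for name, counts in cell_counts.items():
        background = [math.log(c + pseudocount) for c in empty_counts[name]]
        mu, sd = mean(background), stdev(background)
        normalized[name] = [(math.log(c + pseudocount) - mu) / sd for c in counts]
    return normalized


def call_positive_tags(normalized, threshold=3.5):
    """For each cell, list every tag above the threshold (multi-positive allowed).

    The threshold is a hypothetical cutoff in units of background sd.
    """
    n_cells = len(next(iter(normalized.values())))
    return [
        sorted(name for name, values in normalized.items() if values[i] > threshold)
        for i in range(n_cells)
    ]


# Toy data (made up): 3 HTOs, 5 empty droplets, 3 cells.
empty = {
    "HTO1": [2, 4, 3, 5, 1],
    "HTO2": [1, 3, 2, 4, 2],
    "HTO3": [3, 2, 4, 1, 5],
}
cells = {
    "HTO1": [250, 3, 300],
    "HTO2": [180, 220, 2],
    "HTO3": [4, 310, 270],
}

norm = background_normalize(cells, empty)
calls = call_positive_tags(norm)
for i, tags in enumerate(calls):
    print(f"cell {i}: {'+'.join(tags) if tags else 'negative'}")
```

Because every tag is rescaled against its own background, a cell can exceed the cutoff for two or three HTOs at once, which is what a multi-positive design needs. In a real analysis the cutoff would have to be chosen from the distribution of the normalized values rather than fixed in advance.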
"To get singlets and negatives I have to run HTODemux right?"
As you mentioned above, you can actually define negative droplets any way you like. See this part of the vignette:
https://mattpm.github.io/dsb/articles/dsb_normalizing_CITEseq_data.html#how-to-get-empty-drops-without-cell-hashing-or-sample-demultiplexing
We leave it up to the user how to define the matrix of negative drops vs. proteins (empty_drop_matrix) in the package, since experimental setups can differ so much.
We advise that, if you are using Cell Ranger, you load the "raw_bc_output" file containing all possible barcodes and filter from there (the exact number depends on the platform, but this will be hundreds of thousands of barcodes).
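As a rough illustration of "filter from there", the sketch below picks ambient-like barcodes out of an unfiltered barcode list using total protein counts and mRNA UMIs. It is plain Python on made-up numbers; the function name, the log10 protein window, and the mRNA cutoff are hypothetical and would need to be chosen by inspecting the distributions in your own data.

```python
import math


def select_empty_droplets(barcodes, called_cells, min_log10_protein=1.5,
                          max_log10_protein=3.0, max_rna_umis=100):
    """Return barcodes that look like ambient (cell-free) droplets.

    barcodes: dict mapping barcode -> (total_protein_counts, total_rna_umis)
    called_cells: set of barcodes already called as cells (always excluded)
    All thresholds are illustrative assumptions, not recommended defaults.
    """
    empty = []
    for bc, (protein_total, rna_total) in barcodes.items():
        # Skip called cells and barcodes with no protein signal at all.
        if bc in called_cells or protein_total <= 0:
            continue
        log_protein = math.log10(protein_total)
        in_window = min_log10_protein <= log_protein <= max_log10_protein
        if in_window and rna_total < max_rna_umis:
            empty.append(bc)
    return sorted(empty)


# Toy unfiltered barcode table (made up).
raw = {
    "AAAC": (12000, 4500),  # called cell
    "AAAG": (150, 20),      # ambient-like
    "AAAT": (0, 3),         # no protein counts
    "AACA": (400, 35),      # ambient-like
    "AACC": (5000, 60),     # too much protein for an empty droplet
    "AACG": (90, 800),      # too much mRNA
    "AACT": (5, 1),         # too little protein to be informative
}

empty_barcodes = select_empty_droplets(raw, called_cells={"AAAC"})
print(empty_barcodes)
```

The upper bound on protein counts is the "further QC" step: droplets with obvious protein signal are dropped from the background estimate even when their mRNA content is low.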