Barcode demultiplexing on long-read alignments?
Opened this issue · 4 comments
Hi,
I was wondering whether I could use this method to quantify my barcoded Nanopore library.
I have already demultiplexed the ONT library into individual barcodes using other tools and generated a metadata table that matches each readID to its barcode.
It would be great if you have any ideas on how I can generate a barcode-gene count matrix from it.
My current workflow is to align with minimap2, subset the BAM file per barcode by matching readIDs, quantify each subset with Salmon or a similar tool, and compile the counts into one matrix. But this takes a very long time and a lot of memory.
Maybe there is a more efficient way.
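As a rough illustration of what a single-pass alternative to per-barcode subsetting could look like, here is a minimal Python sketch. It assumes two hypothetical inputs that are not produced by any tool mentioned here in exactly this form: a tab-separated readID-to-barcode table, and a stream of (readID, gene) assignments derived from one alignment of the whole library. The function names are made up for illustration. It counts each assigned read once, so it does not reproduce Salmon's handling of ambiguous reads.

```python
import csv
import io
from collections import Counter, defaultdict


def load_barcodes(handle):
    """Read a tab-separated (read_id, barcode) table into a dict."""
    return {
        row[0]: row[1]
        for row in csv.reader(handle, delimiter="\t")
        if len(row) >= 2
    }


def count_matrix(assignments, read_to_barcode):
    """Count reads per (barcode, gene) in a single pass.

    `assignments` yields (read_id, gene) pairs (hypothetical input format).
    The result is sparse: barcode -> Counter(gene -> read count).
    """
    counts = defaultdict(Counter)
    for read_id, gene in assignments:
        barcode = read_to_barcode.get(read_id)
        if barcode is not None:  # skip reads that have no barcode
            counts[barcode][gene] += 1
    return counts


# Tiny made-up example data, only to show the shapes involved.
barcode_table = io.StringIO("r1\tAAAC\nr2\tAAAC\nr3\tGGTT\n")
read_genes = [("r1", "geneA"), ("r2", "geneA"), ("r3", "geneB"), ("r4", "geneA")]

matrix = count_matrix(read_genes, load_barcodes(barcode_table))
for barcode, genes in sorted(matrix.items()):
    print(barcode, dict(genes))
```

Keeping the counts as a nested sparse mapping rather than a dense table matters at this scale, since most of the roughly 200,000 barcodes will only cover a small fraction of genes.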
Thanks!
Shaowen
Hi Shaowen,
How many barcodes do you have?
Kind Regards,
Andre Sim
Hi Andre,
I have almost 200,000 barcodes. It's similar to single-cell data, but the sequence structure is not the same as 10X.
Thanks,
Shaowen
Hi Shaowen,
With that number of barcodes/BAM files, you will run into computational resource issues with the current version of Bambu. We are currently working on a way to handle this, preferably one that does not require subsetting the BAM file. I hope to be able to update you on its progress in the coming month.
Are your barcodes stored in the read name or the BC tag in the BAM file, or only in a separate metadata file?
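To make the distinction concrete: if the barcodes only live in a separate table, they could in principle be copied into the alignments as a `BC:Z:` optional field. The sketch below works on plain-text SAM lines with a hypothetical helper, `tag_sam_line`, and an in-memory readID-to-barcode dict; it is an illustration under those assumptions and not something Bambu provides or requires.

```python
def tag_sam_line(line, read_to_barcode, tag="BC"):
    """Append a barcode tag to one SAM alignment line.

    Header lines (starting with '@') and reads without a known barcode
    are returned unchanged. `read_to_barcode` is a hypothetical dict
    built from the separate metadata table.
    """
    line = line.rstrip("\n")
    if line.startswith("@"):
        return line
    read_id = line.split("\t", 1)[0]  # QNAME is the first SAM column
    barcode = read_to_barcode.get(read_id)
    if barcode is None:
        return line
    return f"{line}\t{tag}:Z:{barcode}"


# Made-up records, only to show the effect.
read_to_barcode = {"r1": "AAAC"}
sam_lines = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
tagged = [tag_sam_line(line, read_to_barcode) for line in sam_lines]
for line in tagged:
    print(line)
```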
Kind Regards,
Andre Sim