10XGenomics/enclone

Fix for barcode reuse?


Dear developers,

I wanted to run enclone on two BCR datasets from one donor, taken before and after treatment.
However, I get the following message:

Significant barcode reuse detected.  If at least 25% of the barcodes in one dataset
are present in another dataset, is is likely that two datasets arising from the
same library were included as input to enclone.  Since this would normally occur
only by accident, enclone exits.  If you wish to override this behavior,
please rerun with the argument ACCEPT_REUSE.

Here are the instances of reuse that were observed:

bcrPBMC_JUQ064_hg19, bcrPBMC_JUQ065_hg19 ==> 245 of 972, 829 barcodes (29.6%)
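
For reference, the reported percentage can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not enclone's actual code: it assumes the percentage is the number of shared barcodes divided by the size of the smaller dataset, which matches the figures above (245 / 829 ≈ 29.6%). The function name and the synthetic barcodes are made up for illustration.

```python
def barcode_reuse(barcodes_a, barcodes_b):
    """Return (shared_count, fraction_of_smaller_dataset) for two barcode lists.

    Assumption: the fraction is taken relative to the smaller dataset,
    which is consistent with the "245 of 972, 829 barcodes (29.6%)" line.
    """
    a, b = set(barcodes_a), set(barcodes_b)
    shared = len(a & b)
    smaller = min(len(a), len(b))
    fraction = shared / smaller if smaller else 0.0
    return shared, fraction


# Synthetic barcodes sized like the two datasets in the message:
# 972 and 829 barcodes with 245 in common.
a = {f"BC{i}" for i in range(972)}
b = {f"BC{i}" for i in range(727, 727 + 829)}

shared, fraction = barcode_reuse(a, b)
print(f"{shared} shared barcodes, {fraction:.1%} of the smaller dataset")
```

With barcodes drawn from the same whitelist and no per-dataset prefix or suffix, some overlap between independent libraries is expected by chance, which is the situation described in this issue.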

The barcodes don't have a prefix or suffix, so it is to be expected that some are present in both datasets.
Is there an argument I can set that adds a string to each barcode, for example via the META CSV file?

Thanks in advance,
Aurelie

nh3 commented

Hello,

I encountered the same message with a different experimental design. We used cell hashing to pool different donors and demultiplexed the data according to https://www.10xgenomics.com/resources/analysis-guides/demultiplexing-and-analyzing-5%E2%80%99-immune-profiling-libraries-pooled-with-hashtags . When we then run enclone across each demultiplexed sample's cellranger multi outputs using the "p1;p2;p3" notation, it complains "Significant barcode reuse detected". It appears that the "all_contig_annotations.json" files generated for the demultiplexed samples contain the same set of contigs and differ only in the is_cell/is_gex_cell/is_asm_cell fields, and enclone seems to ignore those fields. Is it valid to filter down to cell-only contigs and supply those to enclone?
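
To make the question concrete, the filtering I have in mind would look roughly like the sketch below. It assumes that "all_contig_annotations.json" is a JSON array of contig records, each carrying a boolean "is_cell" field. The function names and the output file name are hypothetical, and whether enclone handles such a filtered file correctly is exactly what I am asking.

```python
import json


def keep_cell_contigs(contigs, flag="is_cell"):
    """Keep only contig records whose `flag` field is true.

    Records that lack the field are dropped (treated as non-cell).
    """
    return [c for c in contigs if c.get(flag)]


def filter_contig_file(in_path, out_path, flag="is_cell"):
    """Read a contig annotation JSON array, filter it, write the result.

    Returns (number_of_input_contigs, number_of_kept_contigs).
    """
    with open(in_path) as fh:
        contigs = json.load(fh)
    kept = keep_cell_contigs(contigs, flag)
    with open(out_path, "w") as fh:
        json.dump(kept, fh)
    return len(contigs), len(kept)


# Tiny in-memory example with made-up records.
example = [
    {"barcode": "AAAC-1", "contig_name": "AAAC-1_contig_1", "is_cell": True},
    {"barcode": "AAAG-1", "contig_name": "AAAG-1_contig_1", "is_cell": False},
    {"barcode": "AAAT-1", "contig_name": "AAAT-1_contig_1"},
]
cell_only = keep_cell_contigs(example)
print([c["barcode"] for c in cell_only])
```

For a real sample this would be called as, for example, `filter_contig_file("all_contig_annotations.json", "cell_only_contig_annotations.json")`, once per demultiplexed sample.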

Many thanks!