mikessh/migec

losing MIGs during AssembleBatch

jsabatino37 opened this issue · 2 comments

Hi,
We recently performed a MIGEC analysis on 100,000 PBMCs resulting in 24164 UMIs, but this was reduced to 6283 MIGs during AssembleBatch as shown:

Processed 24164 MIGs, 1445013 reads total, 12809 collisions detected, assembled so far: Fresh_100k_TCRb_R1.t3.cf 6333 MIGs, 1401877 reads; Fresh_100k_TCRb_R2.t3.cf 6295 MIGs, 1317583 reads; Overall 6283 MIGs, 1440694 reads
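For reference, the size of the drop can be worked out from that progress line alone. The sketch below is not part of MIGEC. The function name `summarize` and the regular expressions are assumptions that mirror only the single line pasted above, so other MIGEC versions may need different patterns.

```python
import re

# Progress line copied verbatim from the AssembleBatch output above.
LINE = (
    "Processed 24164 MIGs, 1445013 reads total, 12809 collisions detected, "
    "assembled so far: Fresh_100k_TCRb_R1.t3.cf 6333 MIGs, 1401877 reads; "
    "Fresh_100k_TCRb_R2.t3.cf 6295 MIGs, 1317583 reads; "
    "Overall 6283 MIGs, 1440694 reads"
)


def summarize(progress_line):
    """Return MIG/read counts and simple ratios from one progress line.

    Hypothetical helper: the patterns are inferred from the line above only.
    """
    head = re.search(
        r"Processed (\d+) MIGs, (\d+) reads total, (\d+) collisions", progress_line
    )
    tail = re.search(r"Overall (\d+) MIGs, (\d+) reads", progress_line)
    if head is None or tail is None:
        raise ValueError("line does not match the expected progress format")

    total_migs, total_reads, collisions = map(int, head.groups())
    kept_migs, kept_reads = map(int, tail.groups())
    return {
        "total_migs": total_migs,
        "kept_migs": kept_migs,
        "dropped_migs": total_migs - kept_migs,
        "mig_fraction_kept": kept_migs / total_migs,
        "read_fraction_kept": kept_reads / total_reads,
        "collision_fraction": collisions / total_migs,
    }


summary = summarize(LINE)
print(f"MIGs kept: {summary['kept_migs']} of {summary['total_migs']} "
      f"({summary['mig_fraction_kept']:.1%})")
print(f"Reads kept: {summary['read_fraction_kept']:.1%}")
print(f"Collisions per processed MIG: {summary['collision_fraction']:.1%}")
```

For this line, 6283 of 24164 MIGs (about 26%) end up in the overall assembly. The ratios are plain arithmetic on the logged counts and say nothing by themselves about which filter removed the rest.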

Here is the bash script we are using:
MIGEC="/data/programs/migec-1.2.7/migec"
$MIGEC Checkout -cute barcodes.txt *fastq.gz checkout/
$MIGEC Histogram --only-first-read checkout/ histogram/
$MIGEC AssembleBatch --force-overseq 3 --only-first-read --force-collision-filter checkout/ histogram/ assemble/
$MIGEC CdrBlastBatch -R TRA,TRB,TRG,TRD checkout/ assemble/ cdrblast/
$MIGEC FilterCdrBlastResultsBatch cdrblast/ cdrfinal/

We are trying to figure out what is causing the loss of MIGs. Below are the collision and assemble log reports.

collision1.txt
assemble.log.txt

We have tried adjusting --force-overseq (from 3 to 1), --collision-ratio, and removing --force-collision-filter, with little effect. This was an asymmetric paired-end run (400+100) and we are using the --only-first-read option.

Any suggestions on what may be causing the MIG dropout would be appreciated.

Well, this is expected for an asymmetric run. You should check the logs of the Assemble routine to see how many MIGs are filtered for R1 and R2, and why. It is likely that most MIGs are filtered because they don't have enough supporting reads in one of the two mates (R1 or R2).
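A rough way to look at the R1/R2 question with only the numbers already posted is to compare the per-file MIG counts in the AssembleBatch progress line from the original post. This is a sketch, not a MIGEC feature. The helper name `per_read_counts` is hypothetical, and the pattern assumes per-file entries end in `.cf`, as they do in that one line.

```python
import re

# Progress line copied verbatim from the original post.
LINE = (
    "Processed 24164 MIGs, 1445013 reads total, 12809 collisions detected, "
    "assembled so far: Fresh_100k_TCRb_R1.t3.cf 6333 MIGs, 1401877 reads; "
    "Fresh_100k_TCRb_R2.t3.cf 6295 MIGs, 1317583 reads; "
    "Overall 6283 MIGs, 1440694 reads"
)


def per_read_counts(progress_line):
    """Map each per-file entry (name ending in .cf) to its assembled MIG count."""
    # Assumption: entries look like "<name>.cf <N> MIGs, <M> reads".
    entries = re.findall(r"(\S+\.cf) (\d+) MIGs, \d+ reads", progress_line)
    return {name: int(migs) for name, migs in entries}


counts = per_read_counts(LINE)
total = int(re.search(r"Processed (\d+) MIGs", LINE).group(1))
overall = int(re.search(r"Overall (\d+) MIGs", LINE).group(1))

for name, migs in counts.items():
    print(f"{name}: {migs} MIGs ({migs / total:.1%} of processed)")
print(f"Overall: {overall} MIGs ({overall / total:.1%} of processed)")

# Gap between the best and worst file, in MIGs.
spread = max(counts.values()) - min(counts.values())
print(f"Spread between files: {spread} MIGs")
```

If one mate were responsible for most of the loss, its count would be expected to sit well below the other. The detailed per-filter numbers still have to come from the Assemble logs.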

Hi Mikhail,
Thank you for the comments. According to the assemble log, we're losing a large number of MIGs as a result of filtering. Regarding the asymmetric run, I've sequenced the exact same library as a symmetric paired-end run (150+150) and got nearly identical MIG output, with the same loss of MIGs. Is there a way to get more information on which MIGs were filtered out and why? It doesn't appear to be a sequencing depth issue, as you can see from the attached report. According to the report, 26% of the UMIs were filtered for not passing the MIG size threshold. I'd appreciate any additional thoughts.

Best,
Joe

joe_tcrb_migec_summary.pdf