
blocklist coverage_exclusion


When evaluating whether the new Henry Ford cohort (batch H2) passed the coverage_exclusion threshold in the blocklist, we noticed some inconsistencies between whether a sample would have passed in the original data freeze and whether it passes now. The criteria are based on the average "MEAN_COVERAGE" values from the wgsmetrics.txt files for each aliquot, with samples more than 2 standard deviations below the mean failing the coverage threshold. While these values are inexact for whole exome sequencing files, they offer a convenient "relative coverage" metric for evaluating samples in the context of the overall dataset. Based on these criteria, we noted 6 aliquots with coverage values in the range of those that failed the blocklist in the original data freeze. We are marking these samples as review under coverage_exclusion in the blocklist and will return to them if they give us trouble with copy number or mutation calling.
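For reference, the 2-standard-deviation rule described above can be sketched roughly as follows. This is only an illustration, not the pipeline's actual code: the function names `coverage_cutoff` and `flag_low_coverage` are hypothetical, the MEAN_COVERAGE values are assumed to have already been parsed out of the wgsmetrics.txt files, and the use of the sample (n-1) standard deviation is an assumption.

```python
from statistics import mean, stdev


def coverage_cutoff(reference_coverage, n_sd=2.0):
    """Return the coverage value n_sd standard deviations below the mean.

    reference_coverage: MEAN_COVERAGE values for the reference set of
    aliquots (for example, the original data freeze).
    Assumption: sample (n-1) standard deviation.
    """
    return mean(reference_coverage) - n_sd * stdev(reference_coverage)


def flag_low_coverage(mean_coverage, cutoff):
    """Return the aliquot IDs whose MEAN_COVERAGE falls below the cutoff.

    mean_coverage: dict mapping aliquot ID -> MEAN_COVERAGE.
    """
    return sorted(
        aliquot for aliquot, cov in mean_coverage.items() if cov < cutoff
    )
```

Computing the cutoff from a fixed reference set and then applying it to new aliquots keeps the threshold from shifting each time a cohort is added, which is one possible source of the pass/fail inconsistencies noted above.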


This was fully resolved during the rerun of the H2 cohort. None of the above barcodes exist anymore because of a labeling error, and many of the underlying fastqs were merged once we learned that they came from the same tumor/region/timepoint. As a result, everything in the H2 cohort now passes the coverage exclusion.