NGSEP/NGSEPcore

Low coverage samples

kcleal opened this issue · 3 comments

kcleal commented

Hi,
I'm trying to run NGSEP on data from a single PacBio Sequel II flow cell, e.g. SRR10382244 from SRA (coverage 7.6x). The data is aligned using minimap2 and I'm calling SVs using:

java -jar NGSEPcore_4.3.2.jar SingleSampleVariantsDetector -runOnlySVs -runLongReadSVs -i SRR103822${i}.mm2.bam -r hs37d5.fa -o HG002_${i}.pacbio.NGSEP.vcf

However, I seem to be getting quite low precision and recall compared to what was expected. Are there any options that help when calling low-coverage samples?

kcleal commented

Actually, I just figured out what was happening: I forgot to use the -p 0 option when running Truvari.

ngaitan55 commented

Hi Kez,

I had responded before seeing that you figured it out. Please let us know if you have any further questions about the tool. Thank you for your interest.

kcleal commented

Hi @ngaitan55,

Thanks for the reply. I did manage to replicate these results.

{
    "TP-base": 8632,
    "TP-comp": 8632,
    "FP": 564,
    "FN": 1009,
    "precision": 0.9386689865158765,
    "recall": 0.895342806762784,
    "f1": 0.9164941338854383,
    "base cnt": 9641,
    "comp cnt": 9196,
    "TP-comp_TP-gt": 7897,
    "TP-comp_FP-gt": 735,
    "TP-base_TP-gt": 7897,
    "TP-base_FP-gt": 735,
    "gt_concordance": 0.9148517145505097,
    "gt_matrix": {
        "(1, 1)": {
            "(0, 1)": 183,
            "(1, 1)": 4017
        },
        "(0, 1)": {
            "(1, 1)": 552,
            "(0, 1)": 3880
        }
    }
}
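
For anyone reading the summary above, the headline numbers can be re-derived from the raw counts. The snippet below is only a sketch: it assumes the usual definitions (precision = TP-comp / (TP-comp + FP), recall = TP-base / (TP-base + FN), f1 as their harmonic mean, and genotype concordance as TP-comp_TP-gt / TP-comp), which match the reported values here. The names `summary` and `bench_metrics` are made up for illustration and are not part of Truvari or NGSEP.

```python
# Counts copied from the summary JSON above.
summary = {
    "TP-base": 8632,
    "TP-comp": 8632,
    "FP": 564,
    "FN": 1009,
    "TP-comp_TP-gt": 7897,
}


def bench_metrics(s):
    """Recompute the headline metrics from raw counts (assumed definitions)."""
    precision = s["TP-comp"] / (s["TP-comp"] + s["FP"])
    recall = s["TP-base"] / (s["TP-base"] + s["FN"])
    f1 = 2 * precision * recall / (precision + recall)
    # Fraction of matched calls whose genotype also matches.
    gt_concordance = s["TP-comp_TP-gt"] / s["TP-comp"]
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "gt_concordance": gt_concordance,
    }


metrics = bench_metrics(summary)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The gt_matrix is consistent with this too: the diagonal entries (4017 + 3880) sum to TP-comp_TP-gt = 7897, and the off-diagonal entries (183 + 552) sum to TP-comp_FP-gt = 735.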