Detect minor SNVs not clear enough
gchevignon opened this issue · 2 comments
Hello,
Can you please provide more detail on how to choose a contextmodel?
Also, what type of file should we use as the contextmodel?
- boosting.conf?
- boosting.model?
- a folder?
  - What folder?
    - train_A
    - train_C
    - train_G
    - train_T
    - or just ont or pacbio?
Thanks
Hi gchevignon,
The context model input is simply the whole folder containing "train_A, train_C, ...". The example in README.md shows what the script looks like; here is the link for your convenience: https://www.dropbox.com/home/public/iGDA_examples/pacbio_ecoli/script?preview=igda_detect.sh
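As a quick sanity check before running detection, something like the following could confirm that a folder looks like a context model folder. This is only a minimal sketch and not part of iGDA: the helper name `missing_context_entries` is hypothetical, the four entry names are the ones listed in this thread, and since the thread does not say whether they are files or sub-folders, only their existence is checked.

```python
from pathlib import Path

# Entry names as listed in this thread. Whether they are files or
# sub-folders is not stated here, so only existence is tested.
EXPECTED_ENTRIES = ("train_A", "train_C", "train_G", "train_T")


def missing_context_entries(model_dir):
    """Return the expected entries that are absent from model_dir.

    Hypothetical helper for illustration; iGDA itself does not ship it.
    An empty list means all four expected entries were found.
    """
    root = Path(model_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

An empty result suggests the path points at the whole model folder rather than at one of the entries inside it.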
The context models for PacBio and ONT are different, and they are in separate folders: https://github.com/zhixingfeng/igda_contextmodel
For PacBio, x in qv_x_NCTC_P6_C4 is the QV threshold: bases with QV below x are masked as N, and x=0 means no masking. For ONT, if your data was preprocessed by discarding reads with average QV < x and masking bases with QV < y as "N", use the model named "ont_context_effect_read_qv_x_base_qv_y". "qv_0" means no masking. The "sam_maskqv" command released with iGDA can do the low-QV base masking.
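To make the naming convention concrete, here is a small sketch that builds the folder name from the thresholds used in preprocessing. The function names `pacbio_model_name` and `ont_model_name` are hypothetical, and the code only formats the patterns quoted above; whether a model folder actually exists for a given pair of thresholds has to be checked in the igda_contextmodel repository.

```python
def _check_qv(value, label):
    """Reject thresholds that cannot be valid QV cutoffs."""
    if not isinstance(value, int) or value < 0:
        raise ValueError(f"{label} must be a non-negative integer, got {value!r}")


def pacbio_model_name(base_qv):
    """Format the PacBio pattern qv_x_NCTC_P6_C4 (x = base masking threshold)."""
    _check_qv(base_qv, "base_qv")
    return f"qv_{base_qv}_NCTC_P6_C4"


def ont_model_name(read_qv, base_qv):
    """Format the ONT pattern ont_context_effect_read_qv_x_base_qv_y.

    read_qv is the average-read-QV cutoff (x) and base_qv is the
    base masking threshold (y) used during preprocessing.
    """
    _check_qv(read_qv, "read_qv")
    _check_qv(base_qv, "base_qv")
    return f"ont_context_effect_read_qv_{read_qv}_base_qv_{base_qv}"
```

For example, PacBio data with no masking maps to `pacbio_model_name(0)`, which gives `qv_0_NCTC_P6_C4`.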
Hope this helps.
Best
Zhixing
Yep, this helps.
Thanks!