nanoporetech/katuali

Training medaka variant calling model

Psy-Fer opened this issue · 3 comments

Hello,

I'd like to train some new medaka variant calling models.
For example, I'd like to train one for human. Two questions:

  1. How do I go about this, and what data do I need?

  2. Can I train on just a region of a human genome?

Any further information on how the model is created would be awesome.

Thanks!

Oh, just to add: I read through the documentation, and it isn't really clear what data are used or how the actual training works. This is where I would like more specific clarification.

Cheers.

Hi @Psy-Fer,

Apologies for not answering sooner, and thanks for reporting the lack of clarity in the documentation. We are in the process of improving katuali and its docs, and will include instructions on how to train SNP medaka models.

Hi @Psy-Fer,
We've just released v0.3.1, which includes a pipeline to train medaka variant calling models. See https://nanoporetech.github.io/katuali/medaka_train_variant.html for details.