Masking Signal Points Outside of Kmer Context
Closed this issue · 2 comments
Hello, I was wondering whether there is a way to mask signal points that fall outside the kmer context window.
For instance, say I am using chunk context [50,50] and kmer context [4,4]. Is there a way to mask the signal that maps to bases more than 4 bp on either side of the target nucleotide?
Let me know your thoughts. Thank you!
This is not likely to be a feature we would officially add to Remora in the near future, as I am unsure how this would produce a robust model without also adding this processing at inference time. I'm not sure exactly how you would like to mask these values. Are you proposing to simply remove the base annotations? Or to remove the signal at these bases as well? Or to somehow include them, but not allow gradients to be derived from these parts of the reads? If you are looking for operations on the signal, then this bit of code would be the place to test changes.
Makes sense. My thought process was to implement masking during both training and inference (using remora infer), although I don't have a clear idea of how I'd go about this. For now, updating the extract chunks code should work. Thanks for pointing me there.
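
For anyone landing here later, the masking idea discussed above could be sketched roughly as follows. This is not Remora code and does not use Remora's API: the function name `mask_signal_outside_kmer`, the `base_starts` boundary list, and the `fill` value are all hypothetical, chosen only for illustration. It assumes you already have, for each chunk, the signal samples plus the sample index at which each base starts, and it overwrites every sample that maps to a base outside the kmer window around the target base. The same function would need to be applied at both training and inference time.

```python
def mask_signal_outside_kmer(signal, base_starts, focus_base, kmer_context, fill=0.0):
    """Return a copy of `signal` with samples outside the kmer window set to `fill`.

    All names here are hypothetical (not part of Remora):
      signal       -- list of signal samples for one chunk
      base_starts  -- list of length n_bases + 1; base i covers
                      signal[base_starts[i]:base_starts[i + 1]]
      focus_base   -- index of the target base within the chunk
      kmer_context -- (bases_before, bases_after), e.g. (4, 4)
    """
    n_bases = len(base_starts) - 1
    before, after = kmer_context

    # Clamp the kmer window to the bases actually present in the chunk.
    first_base = max(focus_base - before, 0)
    last_base = min(focus_base + after, n_bases - 1)

    # Convert the base window into a half-open range of sample indices.
    keep_start = base_starts[first_base]
    keep_end = base_starts[last_base + 1]

    return [
        value if keep_start <= idx < keep_end else fill
        for idx, value in enumerate(signal)
    ]


# Toy example: 10 bases, 2 samples per base, target base 5, kmer context (1, 1).
signal = [float(i) for i in range(1, 21)]
base_starts = list(range(0, 21, 2))
masked = mask_signal_outside_kmer(signal, base_starts, focus_base=5, kmer_context=(1, 1))
```

In the toy example only the samples belonging to bases 4 through 6 (indices 8 to 13) are kept. Filling with a constant is just one choice; whether a constant, noise, or dropping the samples entirely works best for the model is an open question here.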