Open questions regarding implementation details

Question

Opened this issue 4 years ago · 1 comments

If two MS-features, e.g. within one sequence, have the same candidate set, should the randomly sub-sampled candidate sets, e.g. those used during training, be identical for both features?
Is it sufficient to calculate the average (e.g. top-k) accuracy over the sequences in the sample? That is, we first calculate the average (e.g. top-k) accuracy within each sequence and subsequently average these values over the samples.
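
To make the second question concrete, here is a minimal sketch of the two-stage averaging it describes. The function names and the representation of a sequence as a list of ranks of the correct candidate are assumptions made for illustration and are not taken from the project's code.

```python
from statistics import mean


def sequence_topk_accuracy(ranks, k):
    """Fraction of features in one sequence whose correct candidate is ranked <= k.

    `ranks` holds the 1-based rank of the correct candidate for each feature
    (a hypothetical representation chosen for this sketch).
    """
    return mean(1.0 if r <= k else 0.0 for r in ranks)


def sample_topk_accuracy(sequences, k):
    """Stage 1: average within each sequence. Stage 2: average over sequences."""
    return mean(sequence_topk_accuracy(ranks, k) for ranks in sequences)


# Two sequences of different length (made-up ranks).
sequences = [[1, 3, 7], [2, 12]]
print(sample_topk_accuracy(sequences, k=5))
```

With sequences of unequal length, this two-stage average generally differs from pooling all features and averaging once, because every sequence gets the same weight regardless of how many features it contains.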

Answer 1 · 2021-05-26T11:30:46.000Z

Answer to first question: I think it does not really matter. My experiments showed that we can even draw a different random candidate subset each time a specific spectrum re-appears in a training sequence. This only matters during testing, but there we do not randomly sub-sample anyway.
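
A rough sketch of what drawing a fresh subset on every occurrence could look like during training. The helper `subsample_candidates`, the `mol_*` identifiers, and the choice to always keep the correct candidate in the subset are assumptions for illustration and not the project's actual implementation.

```python
import random


def subsample_candidates(candidates, correct, n_max, rng):
    """Return at most n_max candidates drawn at random.

    Assumption of this sketch: the correct candidate is always kept,
    so only the remaining n_max - 1 slots are filled randomly.
    """
    others = [c for c in candidates if c != correct]
    n_draw = min(n_max - 1, len(others))
    return [correct] + rng.sample(others, n_draw)


rng = random.Random(0)  # seeded only to make the sketch reproducible
candidates = [f"mol_{i}" for i in range(20)]  # hypothetical candidate IDs

# The same spectrum appearing twice in training gets two independent draws.
first = subsample_candidates(candidates, "mol_3", 5, rng)
second = subsample_candidates(candidates, "mol_3", 5, rng)
print(first)
print(second)
```

At test time the full candidate set would be used instead, so no such function is called and the question of matching subsets does not arise.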