PhySci/ContrastivePredictiveCoding

Positive and negative samples


Hi,

Very nice and clean code. However, as far as I can tell, the code uses only 1 positive sample to represent the future observations, while the paper uses 1 positive sample plus N-1 randomly sampled negative samples for the NCE loss. Is that right?

/Johan

Hi @strombom
Many thanks for your comment. You are absolutely right: we need one positive sample and a few negative ones to train the model.
To make the calculations more efficient, I introduced a trick in the code. Consider a training batch as a bundle of samples from different categories (!). In this case each category has one positive sample and a few negative samples (the rest of the batch). Therefore, we can calculate the NCE loss across the batch.
To implement this idea, a special type of data generator has been written:

class ContrastiveDataGenerator(Sequence):
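
To illustrate the batch trick described above, here is a minimal NumPy sketch of an NCE-style loss with in-batch negatives. It is not the code from this repository: the function name `in_batch_nce_loss`, the dot-product scoring, and the `(batch, dim)` shapes are assumptions made for the example.

```python
import numpy as np

def in_batch_nce_loss(predictions, targets):
    """NCE-style loss where the rest of the batch supplies the negatives.

    predictions, targets: arrays of shape (batch, dim) (assumed layout).
    Row i of `targets` is the positive for row i of `predictions`;
    every other row of `targets` is treated as a negative for row i.
    """
    # Score every prediction against every target: shape (batch, batch).
    logits = predictions @ targets.T
    # Subtract the row-wise maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    # Row-wise log-softmax over the candidates in the batch.
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive for row i sits on the diagonal.
    return -np.mean(np.diag(log_softmax))
```

With a batch of size N, each row is an N-way classification with 1 positive and N-1 negatives, so an uninformative model scores log(N).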

Does this explain my approach? If not, I will draw a simple diagram to show it.

Hi,

Thanks, I think I understand now.

One question about this line:
contrastive_batch[i, :, :] = batch[self.context_samples:self.context_samples+self.contrastive_samples, :]

Do I understand correctly that the negative samples are always the consecutive samples following right after the positive sample in the data file? Shouldn't the negative samples be sampled from random positions?

Johan

Hi,
Yes, you are right. So far, the positions of the positive samples are fixed, not random. I don't know how this affects the quality.
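
For reference, drawing the negatives from random positions could look roughly like the sketch below. This is only an illustration, not code from the repository: the helper `sample_negative_indices` and all of its parameters are hypothetical, and it assumes the recording is indexed by frame position.

```python
import numpy as np

def sample_negative_indices(num_frames, positive_start, num_positive,
                            num_negative, rng):
    """Pick negative frame positions uniformly at random (hypothetical helper).

    Positions inside the positive window
    [positive_start, positive_start + num_positive) are excluded.
    """
    positive = np.arange(positive_start, positive_start + num_positive)
    # All positions that do not overlap the positive window.
    candidates = np.setdiff1d(np.arange(num_frames), positive)
    # Sample without replacement so no negative is repeated.
    return rng.choice(candidates, size=num_negative, replace=False)

# Usage: 8 random negatives from a 100-frame recording,
# with the positive window covering frames 10..14.
rng = np.random.default_rng(0)
negative_idx = sample_negative_indices(100, 10, 5, 8, rng)
```

The resulting indices could then replace the fixed slice used to fill `contrastive_batch`.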