
Dataset Release

Closed this issue · 5 comments

Hi there!
I really appreciate you sharing your work with the community. I was wondering if you have any plans to release the training dataset. Access to this data would be very helpful for testing different configurations during training and ensuring fair comparisons with your reported results.

Could you let me know if and when the dataset might become available?


Uploading it now :)

d-rau commented

Any plans on releasing the hard negatives too?

They have been out on my HF account (Manu) for a few weeks now, but they are probably not that great, which is why they are not official!
They are basically just mined with BiPali.

d-rau commented

Sorry missed that, thanks!