This indexes the Cohere v3 Wikipedia dataset using JVector.
Edit download.py with the location where you want to save the 180GB dataset.
Then edit Main.java with the corresponding location.
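
Since the dataset is large, it may be worth checking that the chosen location has enough free space before starting the download. The following is a minimal sketch using only the Python standard library. The helper name has_room and the decimal interpretation of 180GB are assumptions made for illustration; this code is not part of download.py.

```python
import shutil
import sys
from pathlib import Path

# Approximate dataset size stated above; assumes decimal gigabytes.
DATASET_BYTES = 180 * 10**9


def has_room(target_dir: str, required_bytes: int = DATASET_BYTES) -> bool:
    """Return True if the filesystem holding target_dir has at least required_bytes free.

    Hypothetical helper, not part of download.py.
    """
    path = Path(target_dir).expanduser().resolve()
    # Walk up to the nearest existing ancestor so the check also works
    # before the target directory has been created.
    while not path.exists():
        path = path.parent
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    verdict = "enough space" if has_room(target) else "not enough space"
    print(f"{verdict} at {target}")
```

Saved under a name of your choice, the script takes the intended dataset location as its only argument. Leave some headroom beyond the raw dataset size, since the index that Main builds also needs disk space.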
Run the Main class. There are no Maven targets, so the easiest way is to import the project into your IDE.