- Download the data from https://drive.google.com/file/d/1Amh8Tp3rM0kdThJ0Idd88FlGRmuwaK6o/view
- Install coccoc-tokenizer from https://github.com/coccoc/coccoc-tokenizer
- Compile the search program
$ mpic++ -O2 -std=c++17 search.cpp -o search
- Compile the split program
$ mpic++ -O2 -std=c++17 split.cpp -o split
- Create the articles directory and pipe the decompressed dump through split
$ mkdir articles && zcat vi_wiki_all.gz | ./split
- Run the search with 3 MPI processes, reading queries from queries.txt
$ mpirun -np 3 ./search ./articles 100000 <queries.txt
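
Before starting the steps above, it may help to confirm that the command-line tools they call are on your PATH. The sketch below is only an illustration: `missing_tools` is a hypothetical helper that is not part of this repository, and it does not check coccoc-tokenizer, since how that library ends up installed depends on your setup.

```shell
#!/usr/bin/env bash
# Sketch only: report which of the given commands cannot be found on PATH.
# missing_tools is a hypothetical helper name, not something shipped here.
missing_tools() {
    local tool
    local missing=0
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    # 0 when everything was found, 1 when at least one tool is missing.
    return "$missing"
}

# The tools named in the steps above.
missing_tools mpic++ mpirun zcat || echo "install the missing tools before building"
```

If nothing is printed, the compile and run commands above should at least be able to start.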