/natural-questions-environment

Docker Environment for Google Natural Questions

Primary LanguagePython

natural-questions-environment

Docker Environment for Google Natural Questions

Prerequisite

  • gsutil
  • docker

Dataset

In the host machine, download the dataset:

gsutil -m cp -R gs://natural_questions/v1.0 /root/Project/NaturalQuestionsData

Run Container

In the host machine, mount dataset and run the image:
docker run -v ~/Project/NaturalQuestionsData:/root/data -it zzj0402/natural-questions-environment bash

Prepare Data

Inside the Docker container, run:

cd /root/language/ && bash /root/language/prepare-data.sh

Combined TF-Records

There will be a combined version of all the data record shards(nq-train.tfrecords.combined) located in the host environment natural_questions/v1.0