ORQA NaturalQuestions pre-trained model accuracy

Hi folks,

Thanks a lot for releasing the source code and the pre-trained models!
I was able to set up everything and run the test evaluation code.

I downloaded the pre-trained NQ checkpoints from gs://orqa-data/orqa_nq_model and ran:

export MODEL_DIR=./orqa_nq_model
python -m language.orqa.predict.orqa_eval \
  --dataset_path=./resplit/WebQuestions.resplit.test.jsonl \
  --model_dir=$MODEL_DIR

I only see an accuracy of 21%:
Accuracy: 0.2101 (427/2032)

This is using the best_default checkpoint (language/language/orqa/models/orqa_model.py, lines 508 to 509 in 2d08af4):

    best_checkpoint_pattern = os.path.join(model_dir, "export", "best_default",
                                           "checkpoint", "*.index")

The accuracy reported in the paper is 33.3%.

Has anyone else faced this problem?
Did I miss a step?

Thanks,
Priyam

The dataset_path that you used is for WebQuestions, not Natural Questions. Please use resplit/NaturalQuestions.resplit.test.jsonl instead.
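
One quick way to confirm which file was scored: the denominator in "Accuracy: 0.2101 (427/2032)" is presumably the number of examples evaluated, so it should match the number of examples in the file passed as dataset_path. Below is a minimal sketch for counting them, assuming each non-empty line of the .jsonl file is one JSON-encoded example. The helper name count_examples is hypothetical and not part of the ORQA code.

```python
import json


def count_examples(jsonl_path):
    """Count the examples in a .jsonl file (hypothetical helper, not ORQA code).

    Assumes one JSON value per non-empty line; blank lines are skipped.
    """
    count = 0
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            json.loads(line)  # raises ValueError if the line is not valid JSON
            count += 1
    return count
```

Calling count_examples("./resplit/WebQuestions.resplit.test.jsonl") and count_examples("./resplit/NaturalQuestions.resplit.test.jsonl") and comparing each result with the denominator in the printed accuracy shows which test set the run actually used.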
