google-research/language

ORQA NaturalQuestions pre-trained model accuracy

priyamtejaswin opened this issue · 1 comment

Hi folks,

Thanks a lot for releasing the source code and the pre-trained models!
I was able to set up everything and run the test evaluation code.

I downloaded the pre-trained NQ checkpoints from gs://orqa-data/orqa_nq_model.
When I run

export MODEL_DIR=./orqa_nq_model
python -m language.orqa.predict.orqa_eval \
  --dataset_path=./resplit/WebQuestions.resplit.test.jsonl \
  --model_dir=$MODEL_DIR

I only see an accuracy of 21%:
Accuracy: 0.2101 (427/2032)

This is using the best_default checkpoint:

best_checkpoint_pattern = os.path.join(model_dir, "export", "best_default",
                                       "checkpoint", "*.index")

The accuracy reported in the paper is 33.3%.

Has anyone else faced this problem?
Did I miss a step?

Thanks,
Priyam

The dataset_path that you used is for WebQuestions, not Natural Questions. Please use resplit/NaturalQuestions.resplit.test.jsonl instead.
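
For reference, the corrected invocation would look roughly like the sketch below. It reuses the same module and flags as the command in the original post and only swaps the dataset file. The DATASET_PATH variable and the file-existence check are additions for illustration, not something orqa_eval requires, and the path assumes the resplit directory sits in the current working directory as in the original command.

```shell
# Same model directory as in the original post.
export MODEL_DIR=./orqa_nq_model

# Natural Questions test split, instead of WebQuestions.
# DATASET_PATH is a helper variable introduced here for readability.
DATASET_PATH=./resplit/NaturalQuestions.resplit.test.jsonl

# Illustrative safeguard: only run the evaluation if the file is present.
if [ -f "$DATASET_PATH" ]; then
  python -m language.orqa.predict.orqa_eval \
    --dataset_path="$DATASET_PATH" \
    --model_dir="$MODEL_DIR"
else
  echo "Dataset file not found: $DATASET_PATH" >&2
fi
```

With the NQ checkpoint and the NQ test split paired this way, the reported accuracy should be comparable to the Natural Questions number in the paper rather than the WebQuestions figure seen above.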