mrqa/MRQA-Shared-Task-2019

The number of examples in the out-of-domain dev set is not accurate

maxsonate opened this issue · 1 comment

For example:
DROP: 1557 vs 1,503
DuoRC.ParaphraseRC: 1648 vs 1,501

Hi,

How are you counting?

If you count using this script, you should get the right numbers:

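from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import sys
import json
import gzip

# Running total of question-answer examples across the file.
examples = 0
# Path to the gzipped JSON-lines file, given as the first argument.
fname = sys.argv[1]

with gzip.open(fname, 'rb') as f:
    for i, line in enumerate(f):
        obj = json.loads(line)

        # The first line may be a header record rather than data; skip it.
        if i == 0 and 'header' in obj:
            continue

        # Each remaining line holds one entry with a list of questions,
        # so count the questions rather than the lines.
        examples += len(obj['qas'])

print('Num examples: %d' % examples)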
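
One possible source of a mismatch is counting something other than the entries in `qas`, for instance lines in the file. That is only a guess about the numbers reported above, not a confirmed diagnosis. The sketch below builds a tiny synthetic file and shows how the three counts can differ. Only the `header` and `qas` keys come from the script above; the `context` and `qid` fields, the `count_mrqa` helper, and the toy data are assumptions made up for illustration.

```python
import gzip
import json
import os
import tempfile


def count_mrqa(fname):
    """Return (lines, contexts, questions) for a gzipped JSON-lines file.

    Hypothetical helper: same header and 'qas' logic as the script above,
    but it also tracks raw line and non-header line counts for comparison.
    """
    lines = contexts = questions = 0
    with gzip.open(fname, 'rt', encoding='utf-8') as f:
        for i, line in enumerate(f):
            lines += 1
            obj = json.loads(line)
            # Skip the optional header record on the first line.
            if i == 0 and 'header' in obj:
                continue
            contexts += 1
            questions += len(obj['qas'])
    return lines, contexts, questions


# Synthetic data: one header line, then two entries with 3 and 1 questions.
# The 'context' and 'qid' fields are placeholders, not the real schema.
records = [
    {'header': {'dataset': 'toy'}},
    {'context': 'c1', 'qas': [{'qid': 'a'}, {'qid': 'b'}, {'qid': 'c'}]},
    {'context': 'c2', 'qas': [{'qid': 'd'}]},
]
path = os.path.join(tempfile.mkdtemp(), 'toy.jsonl.gz')
with gzip.open(path, 'wt', encoding='utf-8') as f:
    for rec in records:
        f.write(json.dumps(rec) + '\n')

lines, contexts, questions = count_mrqa(path)
print('lines=%d contexts=%d questions=%d' % (lines, contexts, questions))
# prints: lines=3 contexts=2 questions=4
```

Running the same helper on a real dev file and comparing the three values may show which quantity a given count corresponds to.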