Line 144 of cli.py's `evaluate` function throws an error when tested with scored predictions on squad data
pk1130 opened this issue · 4 comments
Hey @jplalor @EntilZha! When testing all the functions written in cli.py with the scored predictions of the squad data, `train` and `train_and_evaluate` work well, producing the desired training and evaluation results. But when testing the `evaluate` function separately by passing in the squad.jsonlines file for `TEST_PAIRS_PATH`, the function throws the following error on line 144:

```
KeyError: item_id is not a key in the dict
```

Am I passing the wrong file into `TEST_PAIRS_PATH`? I have also opened issue #18 to clarify what `TEST_PAIRS_PATH` refers to. Please respond at your earliest convenience. Thanks a lot!
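For context, here is a minimal sketch of what I think is happening; the squad row format below is my assumption about the scored-predictions schema, not necessarily the exact fields:

```python
import json

# Hypothetical reproduction: if a squad.jsonlines row is in the training
# format (one "responses" dict per subject) rather than a pairs format,
# looking up row["item_id"] fails exactly like the reported error.
row = json.loads('{"subject_id": "s1", "responses": {"q1": 1, "q2": 0}}')

print(row["subject_id"])  # works fine
print(row["item_id"])     # raises KeyError: 'item_id'
```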
I can't debug right now, but we should probably make sure we include unit/integration tests for new features to avoid issues like this, as well as give some usage examples.
@pk1130 if you have time, it would make for a great PR to debug the issue, create a unit/integration test that reproduces the issue, and create a fix for it.
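As a starting point, here is a rough sketch of what such a test could look like; `load_pairs` is an illustrative stand-in for whatever cli.py does to read the pairs file, not the actual implementation:

```python
import json
import pytest

def load_pairs(path):
    # Illustrative helper: parse a jsonlines file of (subject, item) pairs,
    # failing fast with a clear message when a required key is missing.
    pairs = []
    with open(path) as f:
        for line_num, line in enumerate(f, start=1):
            row = json.loads(line)
            for key in ("subject_id", "item_id"):
                if key not in row:
                    raise KeyError(f"line {line_num}: missing required key {key!r}")
            pairs.append((row["subject_id"], row["item_id"]))
    return pairs

def test_training_format_rejected(tmp_path):
    # A training-format row has no item_id, so loading it as test pairs
    # should fail with a KeyError, reproducing the reported issue.
    bad_file = tmp_path / "squad.jsonlines"
    bad_file.write_text('{"subject_id": "s1", "responses": {"q1": 1}}\n')
    with pytest.raises(KeyError):
        load_pairs(str(bad_file))
```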
Would love to help out! I'll take a look at it when I wake up tomorrow morning :) Just to confirm @EntilZha, in place of the `TEST_PAIRS_PATH` argument I have to pass in the squad.jsonlines file path, right?
The data format for `evaluate` is slightly different than the squad example as implemented. The expectation is that you have:

- A set of learned subject and item parameters
- (subjectID, itemID) pairs that you want to estimate p(correct) for

`TEST_PAIRS_PATH` refers to a new file of jsonlines that have the format:

```
{"subject_id": "s1", "item_id": "q1"}
{"subject_id": "s2", "item_id": "q1"}
{"subject_id": "s1", "item_id": "q3"}
```

This way you only get predictions for the subject-item pairs that are specified (e.g., a previously held-out test set).
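If it helps, a quick sketch of generating such a file; the IDs and the output filename below are just placeholders:

```python
import json

# Write held-out (subject, item) pairs in the jsonlines format that
# evaluate expects: one JSON object per line with both keys present.
held_out_pairs = [("s1", "q1"), ("s2", "q1"), ("s1", "q3")]

with open("test_pairs.jsonlines", "w") as f:
    for subject_id, item_id in held_out_pairs:
        f.write(json.dumps({"subject_id": subject_id, "item_id": item_id}) + "\n")
```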
Closing this as I think we're all set.