nd-ball/py-irt

Question about input and output of example_with_rps

Closed this issue · 3 comments

I have two questions and would appreciate your help.
1. Could you please explain the input to example_with_rps? Specifically, what do modelID, itemID, and response mean?
2. What does the output of example_with_rps.py mean?
Thank you very much!

Hi @EntilZha and @jplalor! @Xiaoqianhou and I are part of a research internship with Dr. Jordan Boyd-Graber at UMD, working on improving QA systems with IRT models. As a first step, we wanted to run a basic IRT model on an existing dataset such as SQuAD or NQ. We see that example.py and example_with_rps.py have some code that runs on JSON data, but we haven't been able to work out the format or usage from the code or documentation. Could you please share a few preliminary examples on an existing dataset, or a subset of one, so that we can understand the usage? Thanks a ton!

Hi, I updated the example_with_rps file to show the example with a small data set. In this example, the data is a binary matrix. Each row is a subject, each column is an item, and each cell indicates whether the subject answered the item correctly.

ModelID corresponds to a subject (e.g., a particular QA model) and itemID corresponds to a particular item (e.g., a particular question in SQuAD). Lines 77-79 get the data into the import format for the IRT models.
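To make the mapping concrete, here is a minimal sketch of flattening a subject-by-item binary matrix into one (modelID, itemID, response) entry per cell. It is not the code from example_with_rps.py; the toy matrix and the helper name `matrix_to_long` are made up for illustration.

```python
# Hypothetical toy data: each row is a subject (model), each column is an item.
# A 1 means the subject answered the item correctly, a 0 means it did not.
responses = [
    [1, 0, 1],
    [0, 0, 1],
]


def matrix_to_long(matrix):
    """Flatten a subject-by-item matrix into three parallel lists."""
    model_ids, item_ids, outcomes = [], [], []
    for model_id, row in enumerate(matrix):
        for item_id, response in enumerate(row):
            model_ids.append(model_id)   # which subject produced the answer
            item_ids.append(item_id)     # which item was answered
            outcomes.append(response)    # 1 = correct, 0 = incorrect
    return model_ids, item_ids, outcomes


model_ids, item_ids, outcomes = matrix_to_long(responses)
```

Each position across the three lists describes one cell of the matrix, so entry k says that subject `model_ids[k]` got `outcomes[k]` on item `item_ids[k]`.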

We're working on more documentation/examples but in the meantime hopefully that helps. If you have other questions let us know.

I’ll add a few quick docs today, but you can also look at the unit test for the 4PL model training, the file format in the test_fixtures folder, and the description here

def from_jsonlines(cls, data_path: Path):
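For a rough idea of what a JSON Lines input could look like, here is a sketch that writes one JSON object per subject. The key names `subject_id` and `responses` are an assumption on my part, as are the helper name and the toy data, so please check them against the files in test_fixtures before relying on them.

```python
import json


def matrix_to_jsonlines(matrix, subject_names, item_names):
    """Serialize a subject-by-item matrix as one JSON object per line.

    The "subject_id" and "responses" keys are assumed, not confirmed
    by this thread; verify them against the test fixtures.
    """
    lines = []
    for name, row in zip(subject_names, matrix):
        record = {
            "subject_id": name,                     # one subject (model) per line
            "responses": dict(zip(item_names, row)),  # item name -> 0/1 outcome
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)


# Hypothetical subjects and items, for illustration only.
text = matrix_to_jsonlines(
    [[1, 0, 1], [0, 0, 1]],
    subject_names=["model_a", "model_b"],
    item_names=["q1", "q2", "q3"],
)
```

Writing `text` to a file gives one line per subject, which is the general shape a JSON Lines reader expects.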