Thanks for your great work! Was the Python code in Programmatic Answer Generation generated by an LLM or written by humans?
One more thing: it looks like you normalize the time format to `XX:XX` in `flight-easy.json`.
However, in `Combined_Flights_2022.csv` the times are stored in a different format, which causes mismatches against the answers and lowers the exact match (EM) score.
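To make the mismatch concrete, here is a minimal sketch of normalizing both sides before comparing. It assumes the CSV stores times as HHMM numbers such as `1123.0` or `923`, which is my reading of the file and not confirmed here. `normalize_time` and `exact_match` are hypothetical helpers, not functions from this repo.

```python
def normalize_time(value):
    """Convert an HHMM-style value (e.g. 1123.0, "923") or "HH:MM" to "HH:MM".

    Assumption: raw CSV times are HHMM numbers, possibly read as floats.
    """
    text = str(value).strip()
    if ":" in text:
        # Already in HH:MM form; just re-pad the two parts.
        hours, minutes = text.split(":")[:2]
    else:
        # Drop a trailing ".0" and left-pad to four digits, e.g. 923 -> "0923".
        digits = str(int(float(text))).zfill(4)
        hours, minutes = digits[:-2], digits[-2:]
    return f"{int(hours):02d}:{int(minutes):02d}"


def exact_match(prediction, reference):
    """Hypothetical EM check that compares times after normalization."""
    return normalize_time(prediction) == normalize_time(reference)
```

With something like this, an answer read straight from the CSV (`1123.0`) and the normalized ground truth (`11:23`) would compare as equal instead of counting as a miss.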
Thanks :)