Correct number of Dialogues in TaskMaster 4 (Coffee)?
kingb12 commented
Hi all! In the paper, I noticed this line saying there are 6,500 Taskmaster dialogues:
The Taskmaster Coffee dataset consists of 6,500 multi-turn conversations, consisting of 20,000
training examples (conversation turns or API calls), and 3,000 reward examples.
After loading and merging the files in the /data folder, though, I only got 3,710 dialogues. Am I loading them incorrectly?
Here is a snippet of how I was loading them:
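import requests

# URL template for the TM-4-2024 data files (data_00.json through data_07.json)
DATA_URL: str = "https://github.com/google-research-datasets/Taskmaster/raw/master/TM-4-2024/data/data_0{i}.json"

if __name__ == '__main__':
    all_data = []
    for i in range(8):
        # Each file is assumed to parse to a JSON list with one entry per dialogue
        data = requests.get(DATA_URL.format(i=i)).json()
        all_data.extend(data)
    print(f"Total number of dialogues: {len(all_data)}")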
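A per-file tally can help narrow down where a discrepancy like this comes from. The following is a minimal sketch, not part of the Taskmaster repository: `summarize_shards` is a hypothetical helper, the sample data is made up, and it assumes each file parses to a JSON list with one entry per dialogue, which is not confirmed here.

```python
from typing import Any, Dict


def summarize_shards(shards: Dict[str, Any]) -> Dict[str, int]:
    """Return the number of top-level entries per shard.

    Hypothetical helper: assumes each shard is a parsed JSON list with one
    entry per dialogue. Shards that are not lists are reported as -1 so an
    unexpected file layout stands out.
    """
    counts = {}
    for name, payload in shards.items():
        counts[name] = len(payload) if isinstance(payload, list) else -1
    return counts


# Toy stand-in data; real input would be the parsed JSON of each data file.
example = {
    "data_00.json": [{"id": "a"}, {"id": "b"}],
    "data_01.json": [{"id": "c"}],
    "data_02.json": {"error": "not a list"},
}
counts = summarize_shards(example)
# Only shards that parsed as lists contribute to the total.
total = sum(n for n in counts.values() if n >= 0)
print(counts)
print(f"Total number of dialogues: {total}")
```

With the real files, the dictionary passed in could map each file name to the result of `requests.get(...).json()` from the snippet above, so that a file with an unexpected layout shows up separately instead of silently changing the total.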
Thank you! If you happen to have an apis.json and evaluation scripts for Response Generation/API Argument Prediction, that would be helpful as well, though I can also write my own.