Story to questions
Hey, is there a way to find, for each story (as downloaded from http://cs.nyu.edu/~kcho/DMQA/), the corresponding questions?
Thanks.
Both the stories and the questions have a URL in the content of the time so you could match them up based on that.
Best, Lasse
Hey, what do you mean by "content of time"?
I do see the URL at the top of each question, but not in the summaries.
Ahh yes, sorry about that. The filenames of the stories are actually hashes of the URL, so you should be able to match them up by hashing the URL in the questions.
https://github.com/deepmind/rc-data/blob/master/generate_questions.py#L376
Yes, this is what I did:
I saved each article's questions to a folder with the same name as the story.
If anyone wants to use it (line 575):
url_hash = Hashhex(question_context.url)
s = question_context.ToString()
h = Hashhex(s)
directory = '%s/questions/%s/%s' % (corpus, dataset, url_hash)
if not os.path.exists(directory):
    os.makedirs(directory)
with open('%s/%s.question' % (directory, h), 'w') as f:
    f.write(s)