Is this data set searchable?
russelljjarvis opened this issue · 2 comments
russelljjarvis commented
kyleclo commented
Hi @russelljjarvis, the dataset is distributed as static JSON Lines files. We don't provide any search interface on top of it. It's searchable only in the sense of what I've used it for:
- Finding papers with a certain metadata field (e.g. papers from ACL or papers in Computer Science). This is just a simple Python loop over the rows, checking each row's metadata field.
- Finding papers that match a certain regex. This is done either with `grep` in bash or with Python: loop through each row, checking the title, abstract, and body text for a match.
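
Both approaches can be sketched in a few lines of Python. This is only a rough illustration, not an official tool: the field names (`venue`, `title`, `abstract`) and the helper names are assumptions chosen for the example, so check them against the actual schema of the files before relying on them.

```python
import json
import re
from typing import Iterable, Iterator, List, Sequence


def iter_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSON Lines row."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def filter_by_field(rows: Iterable[dict], field: str, value: object) -> List[dict]:
    """Keep rows whose metadata field equals the given value.

    `field` is a hypothetical key such as "venue"; adjust to the real schema.
    """
    return [row for row in rows if row.get(field) == value]


def filter_by_regex(
    rows: Iterable[dict],
    pattern: str,
    fields: Sequence[str] = ("title", "abstract"),  # assumed field names
) -> List[dict]:
    """Keep rows where any of the given text fields matches the regex."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [
        row
        for row in rows
        # `or ""` guards against fields that are missing or null.
        if any(regex.search(row.get(name) or "") for name in fields)
    ]
```

To run this against a file, pass the open file handle to `iter_rows`, e.g. `with open(path) as f: rows = list(iter_rows(f))`, where `path` points at one of the downloaded JSON Lines files. Because every query is a full scan, this works for occasional filtering but is no substitute for an index.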
russelljjarvis commented