allenai/s2orc

fail to download metadata.tsv.gz

ForeverGoGOING opened this issue · 2 comments

Hi, I am very interested in this dataset! Just now I tried to download metadata.tsv.gz with the following code:
# ------ download metadata ------
s3_metadata_file = '20190928/metadata.tsv.gz'
local_meta_file = os.path.join(LOCAL_GORC_DIR, 'metadata.tsv.gz')
download_from_s3(bucket, s3_metadata_file, local_meta_file, aws_attribs)
I got a 404 error. Is there anything wrong with my code?

Just to check, can you try running aws s3 ls s3://ai2-s2-gorc-release/20190928/ from your shell? Do you see a list of objects?

Does aws s3 cp s3://ai2-s2-gorc-release/20190928/metadata.tar.gz <LOCAL_GORC_DIR> work? (replace <LOCAL_GORC_DIR> with the actual path)
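The same check can be scripted. Note that the cp command above refers to metadata.tar.gz, while the code in the question requests metadata.tsv.gz, and a 404 is what S3 typically returns for a key that does not exist. Below is a minimal sketch, assuming boto3 is installed and AWS credentials with access to the bucket are configured. The helper names similar_keys and list_keys are hypothetical and are not part of the s2orc code.

```python
import posixpath


def similar_keys(keys, wanted_key):
    """Return keys in the same S3 "directory" whose file stem matches wanted_key."""
    wanted_dir, wanted_name = posixpath.split(wanted_key)
    stem = wanted_name.split(".", 1)[0]  # e.g. "metadata.tsv.gz" -> "metadata"
    matches = []
    for key in keys:
        key_dir, key_name = posixpath.split(key)
        if key_dir == wanted_dir and key_name.split(".", 1)[0] == stem:
            matches.append(key)
    return matches


def list_keys(bucket, prefix):
    """List object keys under a prefix. Needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so similar_keys can be used without boto3

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when no objects match the prefix
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

For example, similar_keys(list_keys("ai2-s2-gorc-release", "20190928/"), "20190928/metadata.tsv.gz") would return any objects in that folder whose name starts with "metadata", whatever their extension. If the exact key you requested is not in the result, the key name in the download code is the likely culprit.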

Thanks for your help!