data file error
Closed this issue · 3 comments
------ code
# Domain-pre-training corpora
dpt_corpus_train = 'data/pubmed_subset_train.txt'
dpt_corpus_train_data_selected = 'data/pubmed_subset_train_data_selected.txt'
dpt_corpus_val = 'data/pubmed_subset_val.txt'

# Fine-tuning corpora
# If there are multiple downstream NLP tasks/corpora, you can concatenate those files together
ft_corpus_train = 'data/BC2GM_train.txt'
----- error part
--2022-12-19 00:51:53-- http://georgian-toolkit.s3.amazonaws.com/transformers-domain-adaptation/colab/files.zip
Resolving georgian-toolkit.s3.amazonaws.com (georgian-toolkit.s3.amazonaws.com)... 54.231.236.185, 52.217.105.132, 54.231.131.113, ...
Connecting to georgian-toolkit.s3.amazonaws.com (georgian-toolkit.s3.amazonaws.com)|54.231.236.185|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2022-12-19 00:51:53 ERROR 404: Not Found.
unzip: cannot find or open files.zip, files.zip.zip or files.zip.ZIP
Q:
Can I get each of the files used in this code separately?
Currently, these data files cannot be downloaded.
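If the files are obtained some other way and placed by hand, a quick check like the one below can show which of the paths from the code above are still missing before the notebook is run. This is only a minimal sketch: `missing_files` and `EXPECTED_FILES` are hypothetical names, not part of the notebook, and it assumes that `dpt_corpus_train_data_selected` is produced by the notebook itself rather than shipped in `files.zip`, so it is left out of the list.

```python
from pathlib import Path

# Paths copied from the notebook configuration quoted above.
# Assumption: these three are inputs that came from files.zip.
EXPECTED_FILES = [
    'data/pubmed_subset_train.txt',
    'data/pubmed_subset_val.txt',
    'data/BC2GM_train.txt',
]


def missing_files(paths, root='.'):
    """Return the entries of `paths` that do not exist as files under `root`."""
    return [p for p in paths if not (Path(root) / p).is_file()]


if __name__ == '__main__':
    missing = missing_files(EXPECTED_FILES)
    if missing:
        print('Missing data files:')
        for p in missing:
            print('  ' + p)
    else:
        print('All expected data files are present.')
```

Run from the notebook's working directory, it lists whichever files still need to be supplied.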
Hello,
I am getting a similar error: "The specified bucket does not exist". Is there any way to download those files?
Same here. I cannot find the data on AWS anymore. Can we get it somewhere else?
When I try to download the dataset used for this notebook, I get the following error:
--2023-02-23 16:12:16-- http://georgian-toolkit.s3.amazonaws.com/transformers-domain-adaptation/colab/files.zip
Resolving georgian-toolkit.s3.amazonaws.com (georgian-toolkit.s3.amazonaws.com)... 52.216.33.89, 3.5.16.206, 3.5.21.124, ...
Connecting to georgian-toolkit.s3.amazonaws.com (georgian-toolkit.s3.amazonaws.com)|52.216.33.89|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2023-02-23 16:12:16 ERROR 404: Not Found.
I am having the same issue as several of the people above. Is there another way to get the dataset files?