ConnectionError: Couldn't reach 'allenai/c4' on the Hub (ConnectionError): the dataset won't download, what's going on?
Mrgengli opened this issue · 1 comment
Mrgengli commented
Describe the bug
from datasets import load_dataset
print("11")
traindata = load_dataset('ptb_text_only', 'penn_treebank', split='train')
print("22")
valdata = load_dataset('ptb_text_only',
                       'penn_treebank',
                       split='validation')
Steps to reproduce the bug
1
Expected behavior
1
Environment info
1
skpig commented
I also can't download "allenai/c4", but a different error is reported:
Traceback (most recent call last):
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 2074, in load_dataset
    builder_instance = load_dataset_builder(
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 1795, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 1659, in dataset_module_factory
    raise e1 from None
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 1647, in dataset_module_factory
    ).get_module()
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 1069, in get_module
    module_name, default_builder_kwargs = infer_module_for_data_files(
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 594, in infer_module_for_data_files
    raise DataFilesNotFoundError("No (supported) data files found" + (f" in {path}" if path else ""))
datasets.exceptions.DataFilesNotFoundError: No (supported) data files found in allenai/c4
Code to reproduce
from datasets import load_dataset
dataset = load_dataset("allenai/c4", "en", split="train", streaming=True, trust_remote_code=True,
                       cache_dir="dataset/en",
                       download_mode="force_redownload")
Environment
datasets 3.0.1
huggingface_hub 0.25.1
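The two reports in this thread fail at different stages: the ConnectionError in the title means the Hub could not be reached at all, while the DataFilesNotFoundError above is raised after the Hub has been contacted. A rough way to rule out the first case is to test basic network reachability before calling load_dataset. The sketch below uses only the standard library. The names `hub_reachable`, `HUB_URL`, and the `opener` parameter are made up for illustration, and the URL assumes the default Hub endpoint with no mirror or proxy. It does not diagnose the DataFilesNotFoundError.

```python
import urllib.error
import urllib.request

# Assumption: the default Hub endpoint. Pass a different URL if you use a mirror.
HUB_URL = "https://huggingface.co"


def hub_reachable(url=HUB_URL, timeout=10, opener=urllib.request.urlopen):
    """Return True if an HTTP request to `url` gets any response at all."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so the network path works.
        return True
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, or a proxy problem.
        return False


# Example: print("Hub reachable:", hub_reachable())
```

If this returns False, the problem is most likely on the network side (firewall, proxy, DNS) rather than in the load_dataset arguments. If it returns True and loading still fails, the cause is probably elsewhere, as in the traceback above.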