huggingface/datasets

ConnectionError: Couldn't reach 'allenai/c4' on the Hub (ConnectionError): the dataset won't download, what's going on?

Mrgengli opened this issue · 1 comments

Describe the bug

from datasets import load_dataset
print("11")
traindata = load_dataset('ptb_text_only', 'penn_treebank', split='train')
print("22")
valdata = load_dataset('ptb_text_only', 'penn_treebank', split='validation')
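
If the ConnectionError in the title is transient (a flaky network or a temporary Hub outage), retrying with a growing delay sometimes gets the download through. The following is only a minimal sketch under that assumption: `retry_call` is a hypothetical helper, not part of `datasets`, and it assumes the error raised is the built-in `ConnectionError` or a subclass of it. It will not help if the Hub is unreachable from the machine altogether.

```python
import time


def retry_call(fn, attempts=3, base_delay=1.0,
               exceptions=(ConnectionError,), sleep=time.sleep):
    """Call fn() up to `attempts` times, doubling the wait after each failure.

    Hypothetical helper for illustration; `sleep` is injectable so the
    backoff can be tested without actually waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error unchanged
            sleep(base_delay * 2 ** attempt)


# Possible usage with the call from the report (needs `datasets` installed):
# from datasets import load_dataset
# traindata = retry_call(
#     lambda: load_dataset("ptb_text_only", "penn_treebank", split="train")
# )
```

Passing a different tuple as `exceptions` covers the case where the failure surfaces as another exception type.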

Steps to reproduce the bug

1

Expected behavior

1

Environment info

1

I also can't download "allenai/c4", but a different error is reported:

Traceback (most recent call last):
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 2074, in load_dataset
    builder_instance = load_dataset_builder(
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 1795, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 1659, in dataset_module_factory
    raise e1 from None
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 1647, in dataset_module_factory
    ).get_module()
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 1069, in get_module
    module_name, default_builder_kwargs = infer_module_for_data_files(
  File "/***/lib/python3.10/site-packages/datasets/load.py", line 594, in infer_module_for_data_files
    raise DataFilesNotFoundError("No (supported) data files found" + (f" in {path}" if path else ""))
datasets.exceptions.DataFilesNotFoundError: No (supported) data files found in allenai/c4

Code to reproduce

from datasets import load_dataset

dataset = load_dataset("allenai/c4", "en", split="train", streaming=True,
                       trust_remote_code=True, cache_dir="dataset/en",
                       download_mode="force_redownload")

Environment

datasets 3.0.1
huggingface_hub 0.25.1
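
For anyone else hitting this and wanting to report their versions in the same form, the two numbers above can be collected with the standard library. This is a small sketch, not an official tool: `collect_versions` is a hypothetical name, and the package list is just the two packages mentioned here. The `datasets-cli env` command prints a fuller environment summary.

```python
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages=("datasets", "huggingface_hub")):
    """Return {package: installed version}, or "not installed" if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    # Print one "package version" pair per line, ready to paste into a report.
    for name, ver in collect_versions().items():
        print(name, ver)
```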