[BUG] Loading a dataset cached in a LocalFileSystem is not supported
xiaohangguo opened this issue · 2 comments
xiaohangguo commented
Recently, a baffling "not supported" error started appearing. I looked into it, and it turns out to be a problem with the datasets version.
10/29/2023 11:23:38 - WARNING - datasets.builder - Found cached dataset json (file:///public/home/lvshuhang/.cache/huggingface/datasets/json/default-01eed702bb47992a/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51)
Traceback (most recent call last):
File "/public/home/lvshuhang/LMFlow-main/examples/finetune.py", line 61, in <module>
main()
File "/public/home/lvshuhang/LMFlow-main/examples/finetune.py", line 53, in main
dataset = Dataset(data_args)
File "/public/home/lvshuhang/LMFlow-main/src/lmflow/datasets/dataset.py", line 104, in __init__
raw_dataset = load_dataset(
File "/public/home/lvshuhang/miniconda3/envs/lmflow/lib/python3.9/site-packages/datasets/load.py", line 1794, in load_dataset
ds = builder_instance.as_dataset(split=split, verification_mode=verification_mode, in_memory=keep_in_memory)
File "/public/home/lvshuhang/miniconda3/envs/lmflow/lib/python3.9/site-packages/datasets/builder.py", line 1089, in as_dataset
raise NotImplementedError(f"Loading a dataset cached in a {type(self._fs).__name__} is not supported.")
NotImplementedError: Loading a dataset cached in a LocalFileSystem is not supported.
10/29/2023 11:23:40 - WARNING - datasets.builder - Found cached dataset json (file:///public/home/lvshuhang/.cache/huggingface/datasets/json/default-01eed702bb47992a/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51)
Traceback (most recent call last):
File "/public/home/lvshuhang/LMFlow-main/examples/finetune.py", line 61, in <module>
main()
File "/public/home/lvshuhang/LMFlow-main/examples/finetune.py", line 53, in main
dataset = Dataset(data_args)
File "/public/home/lvshuhang/LMFlow-main/src/lmflow/datasets/dataset.py", line 104, in __init__
raw_dataset = load_dataset(
File "/public/home/lvshuhang/miniconda3/envs/lmflow/lib/python3.9/site-packages/datasets/load.py", line 1794, in load_dataset
ds = builder_instance.as_dataset(split=split, verification_mode=verification_mode, in_memory=keep_in_memory)
File "/public/home/lvshuhang/miniconda3/envs/lmflow/lib/python3.9/site-packages/datasets/builder.py", line 1089, in as_dataset
raise NotImplementedError(f"Loading a dataset cached in a {type(self._fs).__name__} is not supported.")
Similar issue: huggingface/datasets#6352
Solution:
pip install -U datasets
If possible, I hope LMFlow can update its requirements.txt.
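To see whether an environment is affected before launching a run, a small version check can help. This is only a sketch: it assumes 2.14.6 (the version mentioned later in this thread) is a sufficient minimum, and the helper names `parse_release` and `needs_upgrade` are made up for illustration, not part of datasets or LMFlow.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: 2.14.6 is the version reported in this thread, used here as the minimum.
MIN_DATASETS = (2, 14, 6)


def parse_release(ver: str) -> tuple:
    """Turn '2.14.6' or '2.14.6.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in ver.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component such as 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def needs_upgrade(installed: str, minimum: tuple = MIN_DATASETS) -> bool:
    """Return True when the installed release is older than the minimum."""
    return parse_release(installed) < minimum


if __name__ == "__main__":
    try:
        installed = version("datasets")
    except PackageNotFoundError:
        print("datasets is not installed")
    else:
        if needs_upgrade(installed):
            print(f"datasets {installed} is older than 2.14.6; run: pip install -U datasets")
        else:
            print(f"datasets {installed} looks recent enough")
```

Running the script in the same conda environment as `finetune.py` reports whether the `pip install -U datasets` step above is still needed.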
research4pan commented
Thanks for your interest in LMFlow! We will update it soon after testing. Also, if you are interested in contributing via a PR, we welcome all kinds of contributions to help us improve the repository together. Thanks! 😄
xiaohangguo commented
datasets has been updated to 2.14.6, haha.