Lightning-Universe/lightning-flash

`flash.DataModule` can't load files with a pair of square brackets in the filename

yurijmikhalevich opened this issue · 0 comments

๐Ÿ› Bug

`flash.DataModule` can't load files whose names contain a pair of square brackets, such as `sa_Yuzu Pop A [M] Extra Light_dakufalse.jpeg`. Attempting to load such a file fails with the following error:

Traceback (most recent call last):
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
    self._run(model, ckpt_path=self.ckpt_path)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1098, in _run
    results = self._run_stage()
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1177, in _run_stage
    self._run_train()
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1200, in _run_train
    self.fit_loop.run()
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 267, in advance
    self._outputs = self.epoch_loop.run(self._data_fetcher)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 188, in advance
    batch = next(data_fetcher)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 184, in __next__
    return self.fetching_function()
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 265, in fetching_function
    self._fetch_next_batch(self.dataloader_iter)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/utilities/fetching.py", line 280, in _fetch_next_batch
    batch = next(iterator)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/trainer/supporters.py", line 568, in __next__
    return self.request_next_batch(self.loader_iters)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/pytorch_lightning/trainer/supporters.py", line 580, in request_next_batch
    return apply_to_collection(loader_iters, Iterator, next)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/lightning_utilities/core/apply_func.py", line 47, in apply_to_collection
    return function(data, *args, **kwargs)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 671, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/flash/core/data/io/input.py", line 294, in __getitem__
    return self._call_load_sample(self.data[index])
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/flash/core/data/io/input.py", line 180, in _call_load_sample
    sample_output = getattr(self, f"{_STAGES_PREFIX[self.running_stage]}_load_sample")(_deepcopy_dict(sample))
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/flash/core/data/io/input.py", line 257, in train_load_sample
    return self.load_sample(sample)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/flash/image/classification/input.py", line 59, in load_sample
    sample = super().load_sample(sample)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/flash/image/data.py", line 76, in load_sample
    sample[DataKeys.INPUT] = load_image(filepath)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/flash/core/data/utilities/loading.py", line 154, in load_image
    return load(file_path, _image_loaders)
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/flash/core/data/utilities/loading.py", line 144, in load
    with fsspec.open(file_path) as file:
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/fsspec/core.py", line 441, in open
    return open_files(
  File "/Users/yurij/miniforge3/envs/japanese-reader/lib/python3.8/site-packages/fsspec/core.py", line 195, in __getitem__
    out = super().__getitem__(item)
IndexError: list index out of range

cc @OstaptsovDanil

To Reproduce

  1. Copy the quickstart example from here: https://lightning-flash.readthedocs.io/en/stable/reference/image_classification.html
  2. Rename any image in the dataset so that its filename contains `[M]`
  3. Run the example
  4. See it fail with the error described above
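A likely explanation, offered as an assumption rather than a confirmed diagnosis: the traceback ends in `fsspec.open` indexing into an empty list returned by `open_files`, which is consistent with the path being treated as a glob pattern in which `[M]` is a character class matching the single character `M`. The sketch below uses only Python's standard-library `glob` module to illustrate that interpretation; it does not call Flash or fsspec, and the helper name `demo_bracket_globbing` is hypothetical.

```python
import glob
import os
import tempfile


def demo_bracket_globbing():
    """Show how glob-style matching treats "[M]" in a filename.

    Illustration only: this uses the standard-library glob module and
    assumes (without confirming) that fsspec expands paths in a similar way.
    """
    with tempfile.TemporaryDirectory() as tmp:
        name = "sa_Yuzu Pop A [M] Extra Light_dakufalse.jpeg"
        path = os.path.join(tmp, name)

        # Create an empty placeholder file with the problematic name.
        with open(path, "wb") as f:
            f.write(b"")

        # Unescaped: "[M]" is read as a character class, so the pattern
        # looks for "...A M Extra..." and the real file is not matched.
        raw_matches = glob.glob(path)

        # Escaped: glob.escape() neutralises the special characters,
        # so the literal filename is matched.
        escaped_matches = glob.glob(glob.escape(path))

        return raw_matches, [os.path.basename(p) for p in escaped_matches]


if __name__ == "__main__":
    raw, escaped = demo_bracket_globbing()
    print("unescaped matches:", raw)
    print("escaped matches:", escaped)
```

If this is indeed the cause, escaping the path before glob expansion, or opening the file without glob expansion at all, would be candidate fixes; which one suits Flash is left open here.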

Expected behavior

The image loads and training runs without error.

Environment

  • OS (e.g., Linux): macOS
  • Python version: 3.8.15
  • PyTorch/Lightning/Flash Version (e.g., 1.10/1.5/0.7): torch==1.13.1, flash==0.8.1
  • GPU models and configuration: none
  • Any other relevant information: n/a