Lightning-Universe/lightning-flash

Issue with `ImageClassificationData.from_datasets`

funnym0nk3y opened this issue · 1 comment

๐Ÿ› Bug

There seems to be an issue with the `ImageClassificationData.from_datasets` method. It fails to create a datamodule in the expected format, in which the labels can be accessed via `datamodule.labels`.

To Reproduce

The error occurred with the following code, adapted from the example:

...

datamodule = ImageClassificationData.from_datasets(
    train_dataset=train_dataset,
    val_dataset=valid_dataset,
    batch_size=32,
)


# 2. Build the task
model = ImageClassifier(backbone="efficientnet_b0", labels=datamodule.labels)

...

The datasets are created via

...

train_val_dataset = datasets.ImageFolder(train_val_folder)

....

train_dataset, valid_dataset = random_split(
    dataset=train_val_dataset,
    lengths=[no_train_images, no_valid_images],
    generator=torch.Generator().manual_seed(42),
)
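
For context, `random_split` returns `torch.utils.data.Subset` objects rather than `ImageFolder` instances, and a `Subset` does not forward attributes such as `classes` from the dataset it wraps. The sketch below illustrates this with a hypothetical `ToyFolder` class standing in for `datasets.ImageFolder` (so no image files are needed); the class names and sizes are made up. Whether this is what keeps `datamodule.labels` from being populated is an assumption, not something confirmed in this thread.

```python
import torch
from torch.utils.data import Dataset, Subset, random_split


class ToyFolder(Dataset):
    """Hypothetical stand-in for datasets.ImageFolder; exposes `classes` as ImageFolder does."""

    classes = ["cat", "dog"]  # made-up class names for illustration

    def __init__(self, n=10):
        # Dummy (image, label) pairs; real code would load images from disk.
        self.samples = [(torch.zeros(3, 8, 8), i % 2) for i in range(n)]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]


full = ToyFolder(10)
train_dataset, valid_dataset = random_split(
    dataset=full,
    lengths=[8, 2],
    generator=torch.Generator().manual_seed(42),
)

# random_split wraps the original dataset in Subset objects.
is_subset = isinstance(train_dataset, Subset)

# The Subset itself does not expose the class names ...
has_classes = hasattr(train_dataset, "classes")

# ... but they remain reachable through the wrapped dataset.
class_names = train_dataset.dataset.classes
```

If the class names are needed after splitting, they can be read from `train_dataset.dataset.classes` as shown in the last line.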

Expected behavior

I'd expect the `from_datasets` method to create a valid datamodule to use for training.

Environment

  • OS (e.g., Linux): Colab instance
  • Python version: 3.8 I guess
  • PyTorch/Lightning/Flash Version (e.g., 1.10/1.5/0.7): 1.13.0+cu116/1.8.6/0.8.1.post0

Borda commented

@funnym0nk3y could you pls share a full example so it is clear what packages and imports you used?