meetps/pytorch-semseg

Error in UNet validation: IndexError: boolean index did not match indexed array along dimension 0; dimension is 150544 but corresponding boolean dimension is 327184

kHarshit opened this issue · 1 comment

I'm trying to train UNet on a custom dataset. Training runs fine, but the following error occurs during the validation step (when val_interval: 50 is reached).

The config file is as follows:

model:
    arch: unet
data:
    dataset: idd
    train_split: train
    val_split: val
    img_rows: 572
    img_cols: 572
    path: /mnt/Data/Datasets/testing/ 
training:
    n_workers: 0
    train_iters: 100
    batch_size: 2
    val_interval: 50
    print_interval: 25
    loss:
    optimizer:
        name: sgd
        lr: 1.0e-4
    l_rate: 1.0e-4
    lr_schedule:
      name: constant_lr
    momentum: 0.99
    weight_decay: 0.0005
    resume: 
    visdom: False
    augmentations:
      scale: 572

The error is:

RUNDIR: runs/unet_idd/30669
Found 46 train images
Found 83 val images
INFO:ptsemseg:Using default cross entropy loss
INFO:ptsemseg:Using loss <function cross_entropy2d at 0x7f95a6962b70>
/mnt/Data/anaconda3/envs/pyenv/lib/python3.7/site-packages/torch/nn/_reduction.py:46: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
  warnings.warn(warning.format(ret))
Iter [25/100]  Loss: 3.2779  Time/Image: 1.3911
INFO:ptsemseg:Iter [25/100]  Loss: 3.2779  Time/Image: 1.3911
Iter [50/100]  Loss: 3.2702  Time/Image: 1.4807
INFO:ptsemseg:Iter [50/100]  Loss: 3.2702  Time/Image: 1.4807
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "train.py", line 231, in <module>
    train(cfg, writer, logger)
  File "train.py", line 165, in train
    running_metrics_val.update(gt, pred)
  File "/nfs/interns/kharshit/pytorch-semseg/ptsemseg/metrics.py", line 21, in update
    self.confusion_matrix += self._fast_hist(lt.flatten(), lp.flatten(), self.n_classes)
  File "/nfs/interns/kharshit/pytorch-semseg/ptsemseg/metrics.py", line 15, in _fast_hist
    n_class * label_true[mask].astype(int) + label_pred[mask], minlength=n_class ** 2
IndexError: boolean index did not match indexed array along dimension 0; dimension is 150544 but corresponding boolean dimension is 327184

The problem was that UNet takes a 572x572 input but outputs a 388x388 mask, which explains the numbers in the error: 388² = 150544 and 572² = 327184. I had to pad the predicted labels from 388 back to 572 (as suggested by #43 (comment)) as follows:

# In /ptsemseg/metrics.py
def update(self, label_trues, label_preds):
    for lt, lp in zip(label_trues, label_preds):
        # print(lt.shape, lp.shape)  # (572, 572), (388, 388)
        # Reflect-pad the prediction by 92 px on each side: 388 + 2*92 = 572
        lp = np.pad(lp, ((92, 92), (92, 92)), mode='reflect')
        # print(lt.shape, lp.shape)  # (572, 572), (572, 572)
        self.confusion_matrix += self._fast_hist(lt.flatten(), lp.flatten(), self.n_classes)
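To check the fix in isolation, here is a minimal sketch of the shape mismatch and the padding step, using dummy arrays (the class count of 19 is an arbitrary placeholder, not the actual number of IDD classes):

```python
import numpy as np

# Dummy ground truth (572x572) and prediction (388x388), as in the issue.
label_true = np.random.randint(0, 19, size=(572, 572))
label_pred = np.random.randint(0, 19, size=(388, 388))

# Flattened sizes differ, which is what _fast_hist's boolean mask trips on:
# 388**2 = 150544 vs 572**2 = 327184.
assert label_pred.size == 150544 and label_true.size == 327184

# Reflect-pad the prediction by 92 pixels on each side: 388 + 2*92 = 572.
label_pred = np.pad(label_pred, ((92, 92), (92, 92)), mode='reflect')
assert label_pred.shape == label_true.shape == (572, 572)
```

After the pad, lt.flatten() and lp.flatten() have equal length, so the boolean mask in _fast_hist lines up and the confusion-matrix update goes through.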