[BUG] `workers` Parameter not Respected by DataLoaders
authman opened this issue · 1 comment
authman commented
Describe the bug
Only 1 thread (core) is used for the dataloaders.
To Reproduce
Steps to reproduce the behavior:
- Spin up any of the training examples
- Set `batch_size` to something respectable, like 512
- Adjust the `workers` dataloader parameter
- Examine CPU utilization (see the sketch after this list for one way to watch it)
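For the last step, beyond watching htop, per-core load can also be polled from Python with the third-party psutil package. A minimal sketch (psutil is not part of this repo, it is just a convenience for the check):

import psutil  # third-party: pip install psutil

# Print one utilization percentage per core, once per second.
# With functioning DataLoader workers, many cores should show load.
for _ in range(30):
    print(psutil.cpu_percent(interval=1.0, percpu=True))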
Expected behavior
Multiple cores get engaged and are used to feed the GPU(s).
Desktop (please complete the following information):
- OS: Ubuntu 20.04.2 LTS
- Graphics: 2x GeForce RTX 3090
Additional context
train_dataset_dict = create_dataset_dict(
    data_dir=data_dir,
    project_name=project_name,
    center=center,
    size=train_size,
    batch_size=train_batch_size,
    virtual_batch_multiplier=virtual_train_batch_multiplier,
    normalization_factor=normalization_factor,
    one_hot=one_hot,
    workers=16,
    type='train'
)
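My assumption is that `workers` is meant to be forwarded to the underlying torch.utils.data.DataLoader as num_workers, roughly like the sketch below (hypothetical plumbing, not the library's actual code; make_loader and the dict keys are illustrative):

from torch.utils.data import DataLoader

def make_loader(dataset, dataset_dict):
    # If `workers` never reaches num_workers here, loading stays in the
    # main process regardless of what the caller requested.
    return DataLoader(
        dataset,
        batch_size=dataset_dict['batch_size'],
        shuffle=True,
        num_workers=dataset_dict['workers'],
        pin_memory=True,
    )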
To help debug, I put together this dummy script in the same virtual environment:
import numpy as np
import torch
from torch.utils.data import Dataset
from tqdm.auto import tqdm

class TestDS(Dataset):
    def __len__(self):
        return 5000

    def __getitem__(self, index):
        # Deliberately CPU-heavy item construction so worker load shows up
        z = np.zeros((256 * 256))
        for i in range(256 * 256):
            z[i] = i
        return z

val_dataset = TestDS()
val_dataset_it = torch.utils.data.DataLoader(
    val_dataset,
    batch_size=32,
    shuffle=True,
    drop_last=True,
    num_workers=12,
    pin_memory=True
)

while True:
    for i, sample in enumerate(tqdm(val_dataset_it)):
        sample = sample.to('cuda:1')
Running the above results in proper core utilization.
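As an extra sanity check that worker processes really spawn, torch.utils.data.get_worker_info() can be queried inside __getitem__. A minimal self-contained sketch (the dataset and sizes are illustrative):

import os
import numpy as np
from torch.utils.data import Dataset, DataLoader, get_worker_info

class WorkerCheckDS(Dataset):
    def __len__(self):
        return 64

    def __getitem__(self, index):
        # get_worker_info() returns None in the main process,
        # and per-worker metadata inside a worker process.
        info = get_worker_info()
        wid = 'main' if info is None else info.id
        print(f"item {index}: worker_id={wid}, pid={os.getpid()}")
        return np.zeros(4)

if __name__ == '__main__':
    loader = DataLoader(WorkerCheckDS(), batch_size=8, num_workers=4)
    for batch in loader:
        pass  # with num_workers=4 the prints should come from 4 distinct PIDs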
Even adding the following code at the head of the EmbedSeg training script does not help:
import os
os.environ["MKL_NUM_THREADS"] = "20"
os.environ["OMP_NUM_THREADS"] = "20"
authman commented
Closing this bug; it looks like the issue is the nested for loops in the loss function, not the dataloader itself.
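For anyone landing here later: a bottleneck like that is typically fixed by vectorizing. A hypothetical before/after (not the actual loss from this repo; slow_l2 and fast_l2 are illustrative names):

import torch

def slow_l2(pred, target):
    # Nested Python loops: single-core, and dominates the step time
    total = 0.0
    for b in range(pred.shape[0]):
        for i in range(pred.shape[1]):
            total += (pred[b, i] - target[b, i]) ** 2
    return total / pred.numel()

def fast_l2(pred, target):
    # The same computation as one vectorized tensor expression
    return ((pred - target) ** 2).mean()

pred = torch.randn(4, 1024)
target = torch.randn(4, 1024)
assert torch.allclose(slow_l2(pred, target), fast_l2(pred, target), atol=1e-4)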