ValueError: batch_size should be a positive integer value, but got batch_size=0
rnlee1998 opened this issue · 3 comments
When I run `python bts_main.py arguments_train_nyu.txt`, I get `ValueError: batch_size should be a positive integer value, but got batch_size=0`. What should I do? Here is the traceback:
File "bts_main_FAM.py", line 613, in <module>
main()
File "bts_main_FAM.py", line 607, in main
mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
File "/data2/liran/anaconda3/envs/torch1.4/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
while not spawn_context.join():
File "/data2/liran/anaconda3/envs/torch1.4/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:
-- Process 7 terminated with the following error:
Traceback (most recent call last):
File "/data2/liran/anaconda3/envs/torch1.4/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/data2/liran/workspace/bts/pytorch/bts_main_FAM.py", line 405, in main_worker
dataloader = BtsDataLoader(args, 'train')
File "/data2/liran/workspace/bts/pytorch/bts_dataloader.py", line 56, in __init__
sampler=self.train_sampler)
File "/data2/liran/anaconda3/envs/torch1.4/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 219, in __init__
batch_sampler = BatchSampler(sampler, batch_size, drop_last)
File "/data2/liran/anaconda3/envs/torch1.4/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 190, in __init__
"but got batch_size={}".format(batch_size))
ValueError: batch_size should be a positive integer value, but got batch_size=0```
In bts_main.py you'll notice this line:

```python
args.batch_size = int(args.batch_size / ngpus_per_node)
```

My guess is that your batch_size is smaller than ngpus_per_node. Since `int()` rounds the result down, your per-GPU batch_size becomes 0. For example:

```
batch_size = 3
ngpus_per_node = 4
int(3 / 4) = 0
```

Try increasing your batch_size, ideally to a multiple of ngpus_per_node.
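For concreteness, here is a minimal sketch of that failure mode with hypothetical values; the guard at the end is my own suggestion, not something in bts_main.py:

```python
# Hypothetical values mirroring the failing run.
batch_size = 3       # total batch size from the arguments file
ngpus_per_node = 4   # one spawned worker per GPU

# int() truncates toward zero, so int(3 / 4) == 0; the DataLoader then
# raises the ValueError shown in the traceback above.
per_gpu_batch_size = int(batch_size / ngpus_per_node)
print(per_gpu_batch_size)  # -> 0

# One possible guard: fail with a clear message instead of passing 0 on.
if batch_size < ngpus_per_node:
    raise ValueError(
        f"batch_size ({batch_size}) must be >= ngpus_per_node "
        f"({ngpus_per_node}) so every GPU gets a non-empty batch"
    )
```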
Thank you for your advice, I solved it!
I am using a single GPU, but I get the same error:

`ValueError: batch_size should be a positive integer value, but got batch_size=0`

Can anyone help?
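In case it helps while this is open: a small diagnostic sketch (my own, not from the repo) to check what the script actually sees before the split. Even if you intend to train on one GPU, ngpus_per_node may count every GPU visible on the machine, and it's also worth confirming that batch_size in your arguments file isn't 0.

```python
import torch

# How many GPUs does PyTorch see? If the machine has several and none are
# masked out, the per-GPU split in bts_main.py divides by all of them.
print("visible GPUs:", torch.cuda.device_count())

# To restrict the run to one GPU, mask the others before launching, e.g.:
#   CUDA_VISIBLE_DEVICES=0 python bts_main.py arguments_train_nyu.txt
# With a single visible GPU, int(batch_size / 1) == batch_size, so the
# error could only come from batch_size itself being 0 in the arguments.
```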