integer division or modulo by zero
forensicmike opened this issue · 2 comments
- Using Anaconda on Windows, I followed the steps in the Setup section, all successful.
- Downloaded all 4 files from https://huggingface.co/shi-labs/versatile-diffusion/tree/main/pretrained_pth into the `./pretrained` folder.
When I run the command:

```
(versatile-diffusion) C:\Users\mike\Desktop\Versatile-Diffusion>python inference.py --gpu 0 --app image-variation --image ..\invokeai\inputs\00003.png --seed 8 --save log\test.png --coloradj simple
```

I get:

```
Traceback (most recent call last):
  File "inference.py", line 565, in <module>
    vd_wrapper = vd_inference(pth=pth, fp16=args.fp16, device=device)
  File "inference.py", line 35, in __init__
    net = get_model()(cfgm)
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\model_zoo\common\get_model.py", line 87, in __call__
    net = self.model[t](**args)
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\model_zoo\vd.py", line 220, in __init__
    super().__init__(*args, **kwargs)
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\model_zoo\sd.py", line 55, in __init__
    highlight_print("Running in {} mode".format(self.parameterization))
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\model_zoo\sd.py", line 21, in highlight_print
    print_log('')
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\log_service.py", line 16, in print_log
    local_rank = sync.get_rank('local')
  File "C:\Users\mike\Desktop\Versatile-Diffusion\lib\sync.py", line 35, in get_rank
    return global_rank % local_world_size
ZeroDivisionError: integer division or modulo by zero
```
After reviewing line 35 in `sync.py`, it appears that it is dividing by `torch.cuda.device_count()`. I did a little searching, and it seems like it is normal/expected for this to return 0 if you have 1 GPU.
If I add in a check, `if local_world_size == 0: return 0`, I am able to get past that step.
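To illustrate the failure mode and the proposed guard, here is a minimal sketch of the division step. The function signature is simplified for illustration (the real `sync.get_rank` in the repo takes a string argument and reads `local_world_size` from `torch.cuda.device_count()` internally):

```python
def get_rank(global_rank: int, local_world_size: int) -> int:
    # In the repo, local_world_size comes from torch.cuda.device_count(),
    # which returns 0 on a CPU-only torch build; the bare modulo below then
    # raises ZeroDivisionError.
    if local_world_size == 0:
        # Proposed guard: treat the no-visible-GPU case as local rank 0
        # instead of crashing in the logging path.
        return 0
    return global_rank % local_world_size

print(get_rank(0, 0))  # 0 instead of ZeroDivisionError
print(get_rank(5, 4))  # 1
```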
I think this was due to having a non-CUDA-enabled torch build installed. Once I had this fixed, I was able to remove my change and get it to work. Still, the error message associated with this failure wasn't ideal, so it might be worth adding some checks?
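As a sketch of the kind of check that would have made the failure obvious, a hypothetical startup helper (the name `cuda_problem` is mine, not from the repo) could report the CPU-only-torch situation up front instead of letting a `ZeroDivisionError` surface deep in the logging path:

```python
def cuda_problem():
    """Return a human-readable problem description, or None if CUDA is usable."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed."
    if not torch.cuda.is_available():
        # This is the situation hit here: a CPU-only torch build, where
        # torch.cuda.device_count() returns 0 and the modulo crashes.
        return ("No CUDA device found; torch.cuda.device_count() will be 0. "
                "Install a CUDA-enabled torch build or add a CPU fallback.")
    return None
```

Calling this once at the top of `inference.py` and exiting with the returned message would replace the opaque traceback with an actionable error.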
Correct, the root cause is that there is no CUDA, so no GPU can be found.