Single GPU bug with python -m chameleon.miniviewer
MaureenZOU opened this issue · 6 comments
One trick for running with a single GPU is to guard the following two broadcast calls in chameleon.py with `if world_size > 1:`; otherwise it hangs forever.
if world_size > 1:
    dist.broadcast_object_list(to_continue, src=0)
if world_size > 1:
    dist.broadcast_object_list(req, src=0)
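The guard pattern above can be sketched in plain Python without torch installed. The helper name and the stubbed broadcast function below are illustrative, not part of the chameleon codebase; the point is simply that a collective call must be skipped when only one process is running, since it would otherwise block waiting for peers:

```python
def broadcast_if_distributed(world_size, broadcast_fn, obj_list):
    # Only broadcast when more than one process participates;
    # with a single GPU the collective would block forever
    # waiting for other ranks that never join.
    if world_size > 1:
        broadcast_fn(obj_list, src=0)
    return obj_list

# Single-process case: the broadcast is skipped, so nothing hangs.
result = broadcast_if_distributed(1, lambda objs, src: None, ["req"])
# result -> ["req"]
```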
Hi,
When using a single GPU, I still get a hang even after adding these two lines, for inference with both the 30B and 7B models.
On loading the model, world_size comes back as 0 and no worker is started.
Such mysterious code in this project :(
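For context on the `world_size == 0` symptom: torch-style launchers usually communicate the world size through the `WORLD_SIZE` environment variable, which is unset when running plain `python -m ...` instead of `torchrun`. A hedged sketch of a fallback (the helper name is hypothetical, not from the repo):

```python
import os

def get_world_size():
    # Fall back to 1 when the launcher didn't set WORLD_SIZE
    # (e.g. plain `python -m ...` instead of torchrun). A value
    # of 0 would mean no worker processes get started at all,
    # so treat it as single-process too.
    return int(os.environ.get("WORLD_SIZE", "1")) or 1
```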
Interesting. You may want to check your CUDA environment; my problem was fixed by adding those two lines.
I hard-coded world_size to 1, and then got this error:
File "/data/app/ai/chameleon/chameleon/inference/loader.py", line 52, in load_model
    with open(src_dir / "params.json", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/app/ai/models/chameleon/models/30b/params.json'
:(
It seems you haven't downloaded the checkpoint correctly. Could you please check the folder?
BTW, I am running the miniviewer.
Thanks, I had missed consolidated.pth. But its size is also huge, oops.
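A quick way to check the folder for the two files mentioned in this thread (`params.json` from the traceback and `consolidated.pth` from the fix). This helper is a sketch for verifying a download, not part of the repo:

```python
from pathlib import Path

def missing_checkpoint_files(src_dir):
    # Report which of the expected checkpoint files are absent
    # from the model directory, so a partial download is caught
    # before the loader raises FileNotFoundError.
    src_dir = Path(src_dir)
    expected = ("params.json", "consolidated.pth")
    return [name for name in expected if not (src_dir / name).exists()]
```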