facebookresearch/chameleon

Single GPU bug with python -m chameleon.miniviewer

MaureenZOU opened this issue · 6 comments

One trick for the follow-up with one GPU is to guard the two broadcast calls in chameleon.py as below; otherwise it hangs forever.

# Skip the collectives in single-GPU runs: with world_size == 1 there is
# no peer rank to synchronize with, so broadcast_object_list blocks forever.
if world_size > 1:
    dist.broadcast_object_list(to_continue, src=0)
if world_size > 1:
    dist.broadcast_object_list(req, src=0)
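More generally, the fix is to skip collectives whenever there is only one rank. A minimal, framework-free sketch of the same pattern (here `maybe_broadcast` and `broadcast_fn` are hypothetical names I'm introducing; `broadcast_fn` stands in for `torch.distributed.broadcast_object_list` so the logic can be shown without initializing a process group):

```python
def maybe_broadcast(obj_list, world_size, broadcast_fn, src=0):
    """Run the collective only in multi-rank runs.

    With world_size == 1 there is no peer to synchronize with, so the
    real collective would block forever; skipping it is the fix above.
    """
    if world_size > 1:
        broadcast_fn(obj_list, src=src)
    return obj_list
```

In a real run you would pass `dist.get_world_size()` and `dist.broadcast_object_list` instead of the stand-ins.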

Hi,
when using a single GPU, I still get a hang even after adding these two lines, for inference with either the 30b or 7b model.

On loading the model, the returned value of world_size == 0 and no worker was started.

Such mysterious code in this project :(

Interesting. Maybe check your CUDA env; my problem was fixed by adding those two lines.
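A quick way to do that CUDA env check (a sketch; it assumes `nvidia-smi` and PyTorch may or may not be installed, hence the fallbacks):

```shell
# List GPUs the driver sees, if nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi -L
else
  echo "nvidia-smi not found"
fi
# What PyTorch itself sees; a device count of 0 would explain world_size == 0.
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())" \
  || echo "torch check failed"
```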

I hard-coded the world_size to 1, then got this error:

File "/data/app/ai/chameleon/chameleon/inference/loader.py", line 52, in load_model
    with open(src_dir / "params.json", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/app/ai/models/chameleon/models/30b/params.json'

:(

It seems that you haven't downloaded the checkpoint correctly. Could you please check the folder?
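A small sketch for that check (the directory is the one from the traceback above; the file list covers only what this thread mentions, `params.json` and `consolidated.pth`, so adjust both to your setup):

```shell
# Hypothetical checkpoint sanity check; override CKPT_DIR for your layout.
CKPT_DIR="${CKPT_DIR:-/data/app/ai/models/chameleon/models/30b}"
for f in params.json consolidated.pth; do
  if [ -f "$CKPT_DIR/$f" ]; then
    echo "found: $f"
  else
    echo "MISSING: $f"
  fi
done
```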

BTW, I am running the miniviewer.

Thanks, I was missing consolidated.pth. But its size is also huge, oops.